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Video Colorization Diagnostics in Optical 
Telecommunications 


By H. M. PRESBY and R. CHANG* 
(Manuscript received October, 1981) 


A new diagnostic tool is used to study the outputs of light-emitting 
diodes, lasers, and optical fibers, and to provide quality control of 
optical fiber preforms. The object is observed with a video camera, 
and the monochrome signal is processed to synthesize a color display 
from the different intensity levels. The color representation allows 
features not previously observable to be readily recognized and char- 
acterized. Correlation between separated, but equally bright, regions 
of a video image is also easily achieved in that they are now rendered 
in the same color. A calibration procedure allowing quantitative 
information to be quickly obtained from the display is described. 


1. INTRODUCTION 


Video diagnostics is capable of playing an important role in the 
study and characterization of the components used in optical telecom- 
munications. The term itself refers to obtaining information from a 
video signal derived by viewing the object, or a property of the object 
of interest, with a video camera. Prime examples are observing the 
light output of light-emitting diodes (LEDs) or of fibers. The video 
image will then consist of a two-dimensional display of the intensity at 
the plane upon which the camera is focussed. For LEDs, this plane 
could correspond to the output face of the device so that the near-field 
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radiation pattern can be characterized and, in the case of fibers, it 
could be a screen upon which the fiber output is allowed to fall. 

The display contains a very large amount of information in the form 
of point-to-point variations in intensity. This information could of 
course be computer-processed to yield the desired data and, indeed, a 
host of such systems exist in both specialized’? and commercially 
available form. 

However, in many applications, this sophistication and associated 
expense is not required as the desired information could be obtained 
by simply observing the monitor display if some means existed of 
distinguishing the various intensity levels. Unfortunately, it is difficult 
for the monochrome monitor screen and the eye to discern small 
changes in grey scale and to correlate them over an extended two- 
dimensional image. On the other hand, the human visual system is 
very sensitive to small color changes, which can then be used as a 
means of image enhancement. 

This paper describes a method based on the latter effect to achieve 
distinction and subsequent correlation in a video image by converting 
different shades of grey into synthetic color signals. We apply the term 
video colorization to this procedure in analogy with video digitization, 
since the signal is represented by a discrete number of synthetic colors. 
By just observing the color display, all of the required information for 
many applications can then be obtained. Examples are given of studies 
of LED, laser, and fiber output light distributions and also of the 
internal structure of optical fiber preforms. 


Il. VIDEO COLORIZATION 


The advantages of converting monochrome displays into color is 
receiving considerable attention® and is making a major impact in 
computer graphics.* Not only does color separate and highlight infor- 
mation better but it also serves to direct attention to particular areas 
of interest. This latter point has been found to improve operator 
recognition and lower probability of operator error in situations where 
judgments are based on monitor displays. In addition to its laboratory 
research value, the system to be described here is well-suited to a 
manufacturing environment for use by personnel in product testing 
and quality control. 

Video colorization diagnostics is achieved with the arrangement 
shown in Fig. 1. A video camera observes the object or intensity 
distribution of interest either with a camera lens, through a microscope 
or by projecting directly onto the face of the vidicon itself. The vidicon 
is a silicon target type adjusted for linearity and uniformity of response. 

The output of the camera is sent to a video quantizer.” This instru- 
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ment is specifically designed to process the grey-scale characteristics 
of a monochrome video input signal to synthesize color signals from 
different shades of grey. It achieves this by dividing the input signal 
into a maximum of eight slices and assigning a color to each slice. The 
width of the slices is adjustable to any amplitude level between black 
and white, and the color assigned to each slice can be any combination 
of red, blue, and green. 

The video quantizer generates a test pattern on the red-green-blue 
color monitor to facilitate the above adjustments. Three such patterns 
which appear as vertical bands of color are shown in Fig. 2. The width 
of each band corresponds to a voltage range in the video signal. 
Optimal resolution is obtained if the total width of the bands is equal 
to the total voltage range of the input signal. 

Figure 2a shows the condition in which each band has an equal 
width and the intensity progression goes from black to white with the 
colors in the order, black, red, yellow, green, light blue, dark blue, 
violet, and white. Thus, the darker parts of the image will be rendered 
black and red and the brighter segments in white and violet, with the 
other colors falling in between. 

Figure 2b shows the test pattern setup for a logarithmic response 
and Fig. 2a, for equal brightness contour generation. In the latter case, 
as will be shown later, the image on the monitor is composed of a series 
of equal brightness contours. 

The logarithmic response is of particular value in calibrating the 
conversion of grey scale into color. Figure 3a shows the monochrome 
display of a neutral-density step wedge as observed by the video 
camera. Eight steps are included in the field of view varying in density 
from 0.22 to 1.36, giving a range of relative brightness from 100 to 7 
percent. Note that only four of the steps are clearly resolved. With the 
test pattern set for a logarithmic response to match the density steps 
of the wedge, Fig. 3b is observed. Note that now all eight steps are 
seen and that a particular color can be associated with a given grey- 
scale value. We now present the results of applying this system to a 
variety of objects. 


lil. OPTICAL FIBER OUTPUT 


The output radiation pattern of a multimode optical fiber consists 
of a complex configuration determined by the mutual interference 
effects of the propagating modes. A typical pattern observed for laser 
excitation of the fiber is shown in monochrome form in Fig. 4a. The 
main feature noted is patches of light of varying intensity. Generally, 
from a systems perspective it is only the total power in this distribution 
which is of interest; however, recent attention being paid to modal 
noise makes details of the pattern important.® The intensity distribu- 
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tion is also of interest in characterizing the fiber or mode distributions 
themselves, and properties of it can be used in multimode fiber 
interferometry. 

Characterizing the light output distribution or focusing attention 
upon a specific feature of it is made possible with video colorization. A 
colorized intensity pattern is shown in Fig. 4b. Regions of equal 
intensity are now clearly delineated and the brightest patches of light 
stand out in white. For comparison, an equal brightness contour 
display of the pattern is shown in Fig. 4c. 

The output light distribution is extremely unstable to small pertur- 
bations of the fiber. Touching the fiber at any point along its length 
dramatically changes the observed pattern. Even air currents are 
sufficient to perturb the distribution as displayed in Fig. 5, which 
shows the output as observed over a several-minute time interval 
under normal ambient room conditions. If the total power of this 
distribution is not detected (as may occur with certain fixed apertures 
or splices in a fiber system), these variations become an important 
source of modal noise and video colorization offers a way to study the 
effects. Video colorization also opens up other possibilities in that now 
a specific feature of the pattern can be easily monitored and its changes 
as a function of external stimuli recorded. This allows multimode fibers 
to be used in sensitive interferometric-like detection schemes to mon- 
itor quantities like, pressure, temperature, or even electric and mag- 
netic fields.’ 


IV. LIGHT-EMITTING DIODES AND LASERS 


Video colorization is ideally suited to characterize the light outputs 
of LED and laser sources. The output consists of a two-dimensional 
intensity distribution which can be immediately analyzed once it is 
colorized. The LED devices are observed with a microscope focussed 
on their emitting surface to obtain the near-field light distribution. 

Figure 6 shows a poor quality LED observed in black and white and 
in colorized form. The most intense emission regions are clearly seen 
in the colorized photos which show the output of the device as a 
function of increasing current. 

An LED with a relatively uniform output light distribution is shown 
in Fig. 7. Note that the colorized display shows some nonuniformity in 
the contact region (near the upper left) and a small bright area in 
violet just to the left of center. 

Two views of the output of a GaAlAs laser are shown in Fig. 8. 
Figures 8a and 8b show the near-field pattern, and Figs. 8c and 8d 
show the far-field pattern; the latter is obtained by focussing on a 
ground glass upon which the radiation impinged. 
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V. PREFORM STRUCTURE 


As is well known, the quality of the refractive index profile of an 
optical fiber determines, in the absence of mode coupling, the band- 
width of the medium. In the modified chemical vapor deposition 
(MCVD) process, the profile is built up in a layered fashion by varying 
the dopant concentration in the deposited layers. Layer structure, 
then, is an important indicator of profile quality and techniques have 
been developed to render it visible for inspection in the preform state.® 

One way of investigating the layer structure is by immersing the 
preform in matching oil and observing the core with a video camera.? 
A preform observed in this manner is shown in Fig. 9a. The central 
dark band outlined with two bright lines is the axial index depression. 
The two bright lines further out arise from a deposition layer with an 
index much higher than that of the adjacent layers and represent a 
deposition flaw. While this perturbation is visible in the monochrome 
display, its presence can be enhanced by video colorization as shown 
in Fig. 9b. In this display, the test pattern was set so that all grey 
levels below a certain value would be rendered in red. Grey levels 
above this threshold, representing deposition flaws, are rendered in 
yellow. Thus, a difinitive indicator is provided to check preform 
quality. 

Similar observations on a single-mode preform are shown in Fig. 10. 
Figure 10a shows in monochrome the GeO»-doped core in the lower 
region and the phosphosilicate layers above it which make up the 
deposited cladding. Figure 10b is a colorized display with the test 
pattern set to show up the uniform deposition structure in the cladding. 
For Fig. 10c, the test pattern was adjusted to best show uniformity of 
the core deposition. 

Video colorization is also an aid in viewing the layer structure as 
observed in preform slab samples. Figure 11 shows such a sample and 
how colorization helps in making visible the fine details within the 
layers. 

As a last example, consider Fig. 12 which demonstrates a use of 
general utility. The monochromatic display in Fig. 12a comes from 
viewing the output of a collimated tungsten-halogen light source 
intended to supply a uniform intensity distribution. How well this 
uniformity is achieved is clearly seen in the colorized display of Fig. 
12b. We, thus, have a very sensitive procedure to test background or 
incident light distributions for uniformity. 

In conclusion, we have seen that video colorization can be used as a 
tool that can enhance and, in some cases, even make possible the rapid 
interpretation of video images. We have investigated images arising 
from a number of areas related to optical fibers and are confident that 
this technique can be used to advantage in many other studies as well. 
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Fig. 1—Arrangement for video colorization diagnostics. 
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Fig. 2—Test pattern of video quantizer set for (a) bands of equal width; (b) logarithmic 
response, and (c) equal brightness contour generation. 
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Fig. 3—Neutral density step wedge observed (a) monochromatically and (b) and (c) 
by colorization. Note that all eight steps are clearly seen in the latter case. 
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Fig. 4—Output light distribution of multimode optical fiber observed (a) monochro- 
matically, (b) by colorization, and (c) by equal brightness contours. 
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Fig. 5—Variations observed in output light distribution of multimode optical fiber as 
a function of several minutes’ time under ambient room conditions. 
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Fig. 6—Near-field light output of poor quality LED observed in black and white and 
in colorized form as a function of increasing current [(a) through (e)]. Note the changes 
in the brightest emitting regions shown in white. 





Fig. 7—Light-emitting diode with relatively uniform light output observed in non- 
colorized and colorized form. 
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Fig. 8—Near-field [(a) and (b)] and far-field [(c) and (d)] light distribution of GaAlAs 
laser. 
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Fig. 9—Multimode optical fiber preform with layer perturbation, observed in (a) 
black and white and (b) colorization. Th 


e colorized form showing how significant process 
deviations can be readily detected. 


VIDEO COLOR DIAGNOSTICS 279 


Sot isin ander eat baigR 





Fig. 10—Observations of sin 
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Fig. 11—Slab sample from tip of optical fiber preform showing improved rendering of 


layer structure achieved with video colorization. 
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Fig. 12—Light output of tungsten-halogen source showing how nonuniformities are 
clearly seen with video colorization. 
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Measurements of OH Diffusion in 
Optical-Fiber Cores 


By D. L. PHILEN 
(Manuscript received September 25, 1981) 


A color-center laser was used at 0.95 um to determine the activation 
energy for OH diffusion in the cores of current multimode fibers. The 
temperature range of 600 to 800°C was investigated and the activa- 
tion energy determined to be 19,700 + 1600 cal/mole in agreement 
with previous values reported for bulk silica. Based on this value, 
some 23,000 years would be necessary at 90°C to produce a 3-percent 
change in the 0.95-um OH absorption at the core center. Apparently, 
OH diffusion over a service life of optical fibers should not be a 
problem in presently envisioned lightwave applications. 


1. INTRODUCTION 


The diffusion of water (H2O) into optical fiber cores can affect both 
fiber loss’ and bandwidth.” Water contamination of the fiber core could 
occur during three different periods—preform manufacture, fiber 
drawing, and long-term environmental exposure. The first two have 
previously been studied;*° however, environmental data on fused silica 
require longer exposure times® because of slow diffusion rates. It would 
be ideal to have an activation energy which would allow the calculation 
of concentration changes as a function of temperature. Thus, the effect 
of water diffusion in a fiber could be calculated based on the expected 
operating conditions for that fiber. Reported here is the first measure- 
ment of the activation energy for OH diffusion in a fiber core. 

In the transmission wavelength region of current interest, 0.8 to 1.5 
pm, two major OH absorption bands appear. These bands are at 1.38 
and 0.95 pm and are, respectively, the first and second overtone bands 
of the fundamental stretching vibration at 2.7 1m. Two candidates for 
system wavelengths are 0.87 and 1.3 ym—very close to the OH absorp- 
tion bands. A 10-ppm increase of OH would increase the loss by about 
12 dB/km at 0.95 ym, 1.0 dB/km at 0.90 ym,’ and 7 dB/km at 1.3 ym. 
Thus, knowledge of the OH diffusion rate in optical fiber waveguides 
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becomes increasingly important as attention shifts to longer wave- 
lengths. 


ll. THEORY 


Water probably does not exist as a molecule of H2O in glass. On 
attacking a silica surface, water undergoes the following reaction: 


HO + (=Si—O—Si=) glass = 2(=Si—OH) glass. 


This process is interpreted as the conversion of one molecule of water 
into two OH groups by the addition to an Si—O—Si bridge to form 
two (Si-—-OH) groups. This mechanism is supported by data indicating 
the solubility of water in silica (concentration) to be proportional to 
the square root of the vapor pressure.’ 

Diffusion is then viewed as the reformation of the Si—O—Si bridge 
and the transfer of the OH group to an adjacent bridge. 


(=Si,—OH) + (=Sis—O—Sip=) @ 
(=Sip—O—Sia=) + (=Sip—OH). 


This process is unlike ordinary diffusion where preferred sites are 
involved, and it is probably incorrect to refer to OH in glass as OH 
ions or OH radicals since charged ions or free radicals are unlikely to 
exist in the above process. The measured loss attributed to OH is 
simply that arising from the O—H stretching vibration of the OH 
functional group. 

The diffusion of a line source located at the origin in cylindrical 
coordinates is given by: 


= a —r?/4Dt 
C(r, t) = aap e (1) 
(see Appendix A for derivation),’ where 

a = concentration per unit length of the line source 

C = concentration (mole fraction) at radius r and time t 

D = diffusion coefficient 

r = distance from the line source. 
If 7 is in cm and ¢ in seconds, then D has dimensions of cm?/s. The 
diffusion coefficient is a function of temperature and is given by the 
following equation: 


D = Doexp(—E/RT), (2) 


where 
Do = a constant 
E = the activation energy 
R = gas constant in cal mole™ kelvin™ 
T = temperature in kelvin. 
Equations (1) and (2) are used to obtain the activation energy E. 


1 
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Two problems of interest are the diffusion of water from the cladding 
into the core of the fiber and from water surrounding the fiber into the 
cladding and the core. Since light does not propagate with low loss in 
the cladding, optical loss measurements in the cladding are difficult. 
Measurement of the diffusion of water out of the core region is easier 
since optical loss measurements in the fiber core are used for fiber 
diagnostics. Because of microscopic reversibility, the diffusion coeffi- 
cient inside the core is the same for diffusion into or out of the core. 
This approach avoids surface recombination effects observed when 
H.O diffuses onto and off a silica substrate.'° The question we address 
is, Given that water is already inside the fiber, what is its diffusion 
rate into or out of the fiber core? 


ill. EXPERIMENTAL 


The experimental setup illustrated in Fig. 1 incorporates a tunable, 
single-mode, color-center laser.’’ This laser is pumped by a continuous 
wave krypton-ion laser and is tuned to the OH absorption peak at 0.95 
pm. The reel of fiber chosen for test was 1.9 km long and had 
abnormally high loss at 0.95 um because of OH absorption. Fifty 
meters were reeled off and the OH spatial profile determined at the 
end of that 50-meter section by measuring the absorption component 
of the loss calorimetrically, while traversing the color-center laser 
beam across the fiber core. About 100 meters more were reeled off and 
the OH spatial profile measured again at the end of the 100-meter 
section. These two spatial profiles agree with each other within the 
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Fig. 1—Experimental setup for OH diffusion measurement. 
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Fig. 2—The 0.95-ym loss because of OH in the control fiber. 


error of the calorimeter measurement (+0.2 dB/km). This 100-meter 
section was used as the fiber for the diffusion experiments. Finally, the 
spatial profile was measured at the beginning of the 100-meter reel. 
This represents the control fiber and is illustrated in Fig. 2. It shows 
the OH absorption profile to be peaked at the fiber core center and 
decreasing towards the core-cladding boundary. These precautions 
ensured that the OH distribution was uniform over the length of the 
test fiber and ensured the same initial conditions for each test. This 
fiber had a germania-phosphosilicate core, with A = 1.3 percent, and 
1-2 percent phosphorus added for fining; its core was about 40 pm in 
diameter. 

A number of two-meter samples were cut from the 100-meter fiber 
length. The coating was stripped from the samples, and each sample 
was heated in an oven for one day at a fixed temperature. Six temper- 
atures were chosen to give measurable diffusion rates: 600, 655, 680, 
700, 750, and 800°C. The oven temperature was monitored with a 
Chrome-alumel thermocouple and held constant within +2°C. After 
removal from the oven, each sample was placed in the calorimeter and 
the OH spatial profile measured. The spatial resolution is limited by 
the focused spot size to about 5 pm. 

Since this fiber has a core diameter of only 40 pm, the scattering loss 
due to the core-cladding boundary increased dramatically in the region 
near 20 um. In Fig. 3 the spatial profiles for four of the temperatures 
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and for the unexposed control sample show how the absorption profile 
changes with increasing temperature. In the region of 15 ym, a rising 
loss versus radius may indicate in-diffusion from the cladding for the 
800°C sample; however, this indication is not evident in the lower 
temperature samples. The rise is negligible at r = 10 pm, even for the 
800°C sample; therefore, we neglected in-diffusion effects from the 
cladding in analyzing the loss measured at r = 0. 


IV. RESULTS AND DISCUSSION 


In evaluating the results, one must consider the following effects: 

(t) The fiber as produced already contains diffused water as a result 
of preform manufacture and fiber drawing. Unfortunately, we do not 
know how long the preform or fiber was exposed to each temperature 
in this process. As a result, the OH in the starting fiber already has a 
Gaussian distribution rather than the line source distribution that 
would simplify the analysis. 

(tt) In the measurement method used, the laser was focused to a 
spot on the fiber face and the loss at each radial position determined 
calorimetrically. Determining the concentration of OH by evaluating 
the loss at radius r in the fiber core is not straightforward since the 
exciting beam, localized at r = r,, excites modes which cover a large 
range of r. To make the analysis tractable, only losses measured for 
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Fig. 3—Diffusion profiles for four of the exposure temperatures. 
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the r = 0 excitation condition will be used in deriving the activation 
energy. 

Despite these complications, Appendix B shows that yin and you(T), 
the 0.95-um absorption losses at r = 0 before and after exposing the 
fiber for one day at temperature T, are related by: 


E 
In[{ (yin/Your(T’)) — 1] = constant RT’ (3) 

Figure 3 indicates that the r = 0 loss measured for the control fiber 
and the 600°C exposure are very nearly the same. Because of the 
uncertainty associated with each measurement, and their similar mag- 
nitude, the expression In[{(yin/Yout(T) — 1] has a large uncertainty 
associated with it. Thus, the 600°C temperature was omitted from the 
data analysis. 

Equation (3), plotted for the five remaining exposure temperatures 
in Fig. 4, should result in a straight line of slope (-E/R). The value of 
the activation energy determined from the slope is 19,700 + 1600 cal/ 
mole or 0.854 + 0.07 electron volts. While an uncertainty exists in the 
activation energy from thermodynamic considerations alone, the over- 
all uncertainty in the measured value is the uncertainty in the slope. 


1 


SLOPE = -E/R 


in[( Yo / Yr) -1] 


-2 
0.9 1.0 1.1 
1000/7 IN KELVIN 





Fig. 4—Plot of eq. 3. The slope gives the activation energy. 
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It is this calculated uncertainty in the slope which is +1600 cal/mole. 
This value for the activation energy compares quite favorably with the 
values 17,300 + 2000 cal/mole and 18,300 + 500 cal/mole determined 
between 600 and 1200°C in bulk silica by Moulson and Roberts.’° They 
determined two different diffusion coefficients—one for absorbing of 
the OH centers, 


D = (1.0 + 0.2) X 10-%exp(—18,300 + 500)/RT cm’ s™! 
and one for removal, 
D = (2.7 + 1.0) X 10~’exp(—17,300 + 2000)/RT cm’ s7?. 


They point out, however, that the activation energy is the same 
within their experimental error and our value of 19,700 + 1600 cal/ 
mole overlaps theirs once the experimental uncertainty is taken into 
account. 

Moulson and Roberts attribute the difference in the pre-exponential 
factor (1.0 + 0.2) xX 10°° for absorbing versus (2.7 + 1.0) x 107’ for 
removal, to the process 


H,0 + Si—O—Si 2 2(Si—OH) 


being slightly exothermic by 6 kilocalories. Also, the recombination of 
hydroxyl groups at the glass surface may introduce an appreciable 
nonzero concentration of water just below the glass surface—an as- 
sumption made for their calculations, but unnecessary in this study. 

From eq. (18), we can relate the different diffusion times for various 
temperatures. Suppose one fiber is exposed for time ¢ at a temperature 
where the diffusion coefficient is D, and a second fiber for time t’ at 
D’. If the fibers are initially identical (yin = yin in Appendix B) and the 
exposures are equivalent [Your(T) = Your(T’)], then eq. (18) gives Dt = 
D’t’. From Fig. 3, apparently a temperature of about 600°C is required 
to produce diffusion in one day that results in a 3-percent change in 
the absorption at 0.95 um. Using our measured value of E, the time 
required for this to occur at the highest acceptable exposure temper- 
ature in the field (T’ = 90°C) is t’, where 


Deoorc _ exp(—E/RT) _ t’ _ 1.16910 


Dore = exp(—E/RT’) I1day 1.875 x 10°” 
t’ = 23,000 years 


which greatly exceeds the projected 40-year service life for an installed 
cable. Since the OH absorption loss at 1.3 ym is of the same order of 
magnitude as at 0.95 um, this confirms that OH diffusion is not likely 
to be a problem for presently envisioned applications.° 

We have made many assumptions in analyzing these data. We have 
assumed the diffusion coefficient to be independent of concentration, 
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and over the small concentration changes observed here, this is prob- 
ably a valid assumption. We have assumed the activation energy to be 
independent of temperature. In addition, the equation used to model 
the data does not account for changes in viscosity or molal volume of 
the glass. Work is progressing to extend these measurements to lower 
temperatures to check for a temperature dependence of the activation 
energy. At a temperature of 325°C, this extension will require 6.2 
months of exposure in the oven if the present calculations are correct. 

Data on viscosities and molal volumes for ternary glasses like 
GeO.—P20;—SiO, are sparse or nonexistent. Moreover, little is known 
about the prediction of liquid diffusivities, to say nothing of diffusivities 
of very high viscosity materials such as glass. Refer to Refs. 12 and 13 
for excellent texts on the subject of liquid and high-density transport 
phenomena. 


V. SUMMARY 


The OH profile and activation energy for diffusion have been meas- 
ured in a fiber core using a color-center laser and calorimeter. The 600 
to 800°C results presented here agree with the bulk silica results of 
Moulson and Roberts. Thus, the compositional gradients, internal 
stresses and geometry peculiar to the fiber apparently have no signif- 
icant effect on the activation energy. This procedure is a novel ap- 
proach to measuring diffusion and adds confidence that long-term OH 
diffusion will be no problem in currently proposed lightwave systems. 
Plans are being made to extend these measurements to even lower 
temperatures. 
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APPENDIX A 
Derivation of the Diffusion Equation 


In 1855, Adolph Fick empirically proposed that diffusion in one 
dimension was described by: 


J,=—-D es (4) 
ox 
where 
J, = the flux in the x direction per unit time 
C = concentration of the diffusing species 
D = diffusion coefficient. 
Consider a volume element of material between x and x + dx having 
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unit cross-sectional area as shown in Fig. 5. The flux into the element 
minus the flux out equals the rate of accumulation of a species within 
the element. 
ac 

J =a I x+dx = “at dx, (5) 

where 
C = Average concentration in the element 
Cdx = Total amount in the element at time ¢. 

Expand +a, about x in a Taylor series 


Ody oJ, (dx) 
+ oe. 














xtdx = dx + + 6 
wees ax ax” 2 ” 
In the limit as dx — 0, eq. (5) has the form 
dd, dC 
- =—., (7) 
Ox ot 
This is simply a treatment of the conservation of matter. Combining 
eqs. (7) with (4) yields 
ri) aC ac 
tee py ae 
Ox ( <) ot (8) 
If D does not vary with x, we have 
ac ac 9) 
ax? at’ 


Also, if D does not vary with ¢, this is a linear partial differential 
equation known as Fick’s Second Equation. 
In the case of a fiber with cylindrical geometry, (9) becomes, 





s ac 
DV? C=—. (10) 
ot 
This has the solution 
C(r, t) = —— exp[—(r?/4Dt)] rr > 0, ¢>0, (11) 
4n Dt 


where a is a constant relating to the initial concentration. 


Jy ——< ae Ix + x 


x x+dx 


Fig. 5—Diffusion through a unit volume element. 
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APPENDIX B 
Derivation of the Activation Energy 
The distribution of power in a fiber core when the beam is focused 
at r = 0 is approximated by: 
P(r) = Poexp[—(r/ro)’]. (12) 
The OH concentration for diffusion of a line source into a cylinder was 
derived previously [eq. (11) ]: 
a 


Ce 4n Dt 


exp[—(r?/4Dt)], (13) 





when the laser beam is focused at r = 0, the parameter that is measured 
is the 0.95-um absorption loss y(t): 


y(t) = K| P(r)C(r, t)(2zrdr) (14) 


a = core radius 
K =aconstant. 


Define a new constant: 


r= : + me tha (15) 
*"\72 " 4Dt ; 
Now substitute (12) and (13) into (14): 


KMP>» {* : 
Di [ rexot (r/r:)]°dr 


KMPori 
~ ier {1 — exp[—(a/r:)"}} . (16) 





y(t) = 


If a > ri, then exp [— (a/ri)7] « 1; therefore, (16) becomes 
KMPor2 


——_. 1 
ADt + r2 (7) 


y(t) = 


The a > r, assumption is equivalent to the common practice of 


replacing 
| with i : 
0 i) 


exp[—(a/ri)”] < 0.02 


If ro < a/2, then 


(regardless of a?/4Dt), and the assumption is valid. 
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If yin is the loss when the fiber enters the oven and Yout is the loss 
when the fiber is removed from the oven one day later, then eq. (17) 
gives 


Vout — Yin =4D~x (1 day)/KMPoré. (18) 


The diffusion coefficient at the oven temperature T is D = 
Doexp(—E/RT), where E is the activation energy and R is the molar 
gas content. Multiplying (18) by yin and taking the In then gives 


In[(yin/Yout) — 1] = constant —- E/RT. (19) 
Plotting 
In[{(yin/You.) — 1] 
versus 1/T gives a straight line of slope —E/R. 
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A Class of Approximations for the Waiting 
Time Distribution in a GI//G/1 Queueing 
System 
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Single server queueing models are important in the study of a wide 
variety of stochastic systems. A particularly important example is the 
study of task schedules for computer systems with real-time applica- 
tions. In this paper, we present a class of approximations for the 
waiting time distribution in single server queueing systems with 
general independent (renewal) input and general (independent) ser- 
vice time distributions. These approximations allow the analyst to 
use as much (or as little) of the structure of the input and service 
processes as desired. Moreover, they also allow him to concentrate 
on specific quantities associated with the delay distribution that may 
represent the most relevant performance criteria. Examples include 
probability of delay, mean delay, and tails of the delay distribution. 
The methods given in this paper have been used to analyze the 
performance of task schedules in a variety of processor- based systems. 


1. INTRODUCTION 


Single server queueing models arise quite naturally in the study of 
a wide variety of stochastic server systems. A particularly important 
class is single processor computer systems, such as stored program 
control switching systems or nodes in a data communication network. 
In many such applications, it is important to keep as much of the 
structure of the interarrival and service time processes as possible in 
order to obtain realistic results; that is, the simplifying assumptions 
of exponential distributions cannot be made. While the resulting 
GI/G/1 queue may be extremely difficult to analyze, one is often 
content with reasonable approximations that incorporate the main 
features of the problem. In addition, one often desires results that are 
reasonably simple analytically, since the behavior of the GI/G/1 
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queueing model may be the input to the analysis of a more complex 
system. 

Our main purpose here is to present a class of approximations for 
the waiting time distribution, W(x), for such GI/G/1 systems which 
allows the analyst to use as much (or little) of the structure of 
the relevant input and service processes as desired. The resulting ap- 
proximations can be extremely simple in form (e.g., a single exponen- 
tial) or, with additional effort, more complex. In particular, we 
give relatively simple expressions for constants C and a, such that 
Wa(x) =1— Ce™® provides a good fit to the probability of delay, 
Pp = [1— W(0)] and the mean delay, w. Similar approximations are 
developed which provide a good fit to the tails of the delay distribution. 
Thus, these results can be used to study systems with a wide variety 
of performance criteria. Our results also lead to some new heavy traffic 
approximations, as well as a simple approximation for light traffic. 

While we feel that the approximations presented here are of use in 
themselves, we hope that the results obtained will stimulate more 
active research into the general method presented. Of particular inter- 
est is the problem of obtaining more quantitative error bounds to guide 
the user in the application of these techniques. 

We note that the methods given here have been used to analyze 
some rather complex GI/G/1 queueing systems which have arisen in 
the study of a certain class of computer systems with real-time appli- 
cations.” 

Some key results are summarized in the next section to give a 
general idea of the nature of this work. Our basic approach is intro- 
duced in Section III where several (single) exponential approximations 
are derived. In Section IV, we look briefly at the implications of the 
approximations for heavy traffic. In Section V, we look at some special 
cases which lead to analytic statements about the accuracy of our 
various approximations, and in Section VI, their accuracy is assessed 
via several numerical examples. Extensions to more general functional 
forms are considered in Section VII and illustrated via another nu- 
merical example. Some final remarks are given in Section VIII. A 
summary of our notation, key formulas, and approximations are given 
in the Appendix for easy reference. 


ll. SUMMARY OF SOME KEY RESULTS 


The main idea of this work is to assume a functional form for an 
approximation W,(x) to the true waiting time distribution W(x) for a 
GI/G/1 queue. The Lindley integral equation for W(x) is then used to 
obtain the unknown coefficients. We obtain several approximations by 
assuming W(x) = Wa(x) = 1 — Ce™. A key approximation of this 
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type, referred to as approximation A* has C and a determined by eas. 
(7) and (9) with a = ag (also see pages 299 and 300): 


Sete SoU 
1 — K_(az) 
and 
CC Us 


a a* K(0)—K-(az)’ 
where K(u) is the distribution function for the difference, 7%} between 
the service time and interarrival time and 


0 
K_(s) = | e*“dK(u) 


uly = | udK(u) 
0 
and ag satisfies the characteristic equation 
| e“=“dK(u) = 1. 


(The equation numbers of this section correspond to those of later 
sections.) Limiting properties and special cases for this approximation, 
as well as others are discussed. Several numerical examples are also 
given to illustrate the accuracy of these approximations. For an ex- 
ample H:/E3/1 system, the maximum relative error in the approxi- 
mation to Pp given by A®* is found to be 5.0 percent, while the 
maximum relative error in the approximation to w given by A* is 
found to be 1.5 percent. Perhaps the simplest approximation consid- 
ered (using only the first two moments of i) is the one referred to as 
approximation Ago, which is also of the form Wa(x) = 1 — Ce™™, but 
where now the constants C and a are determined by 


—2u 
a= ayH= “ene (14) 


i.e., the heavy traffic a, and 
C = Ceo = [1 — Erf(|a@|/V20u)//[1 + Erf(la|/V20u)], (15) 


where Erf(x) is the error function. Note that (15) has the limiting 
behavior for small (z) (assuming o, tends to a finite limit) 


*The symbol ~ will be used to denote a random variable (see the Appendix for a 
summary of notations used). 


QUEUEING SYSTEM APPROXIMATIONS 297 


2\u 
Ceo ~ 1-2 MA (16) 
u—-0 T Ou 
(p—1) 


which then results in the following limit for the approximate mean 


delay, Weo 
2 C 1 2 
Woo = Se ease ere ae onl? (17) 
QHy w>0 AH T 


(p—1) 





the right-hand side of eq. (17) is, in fact, a well-known lower bound for 
W (as u > 0, e.g., see Ref. 4). 

The accuracy of these approximations, as well as several others we 
develop, are studied in considerably more detail in what follows. We 
also show how the approach taken leads to a class of approximations 
which can be made increasingly more accurate (with additional effort). 


Il. DEVELOPMENT OF THE EXPONENTIAL APPROXIMATION 


We wish to obtain an approximate solution for the waiting time 
distribution, W(x), for a GI/G/1 queueing system. More specifically, 
we have a single server queue where the interarrival times, ¢, between 
customers are independent and identically distributed, taken from a 
general distribution, A(t) and the service times, 7, are independently 
drawn from an arbitrary distribution B(r). A first in-first out discipline 
is assumed. If we let d = 7 — ¢ and denote the distribution of a by 
K(u), then if an (equilibrium) waiting time distribution exists,’ it 
satisfies the well-known Lindley integral equation® 


W(x) = i W(x — y)dK(y), x20 (1) 


(W(x) is equal to 0 for x < 0). 
In this section, we consider approximations W,(x) to W(x) of the 
form 
Wa(x) =1—-Ce™. (2) 


Our objective is to determine suitable values for the constants C and 
a. For this purpose, we will use the relation (1). 


3.1 Pointwise matching 


As is well known (see Ref. 5, pages 376 and 410), if the equation 


| e“dK(u) = 1, (3) 


* Throughout we assume stability, ie., that (1/a) =7<t = (1/A); p = (A/a) < 1. 
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possesses a real nonzero root, az, then 


l= Wx) ~ Cae (4) 


Throughout this paper, we will tacitly assume that for all systems 
considered, eq. (3) does, in fact, possess such a root. Thus, if an 
exponential form is to be used, ag is a reasonable choice for the 
parameter a in (2). On the other hand, substitution of (2) into (1) 
yields the relation 


1-— Ce * = K(x) - corm | e™dK(y). (5) 


Hence, we see that if we choose a = ag, (5) will be valid to order 
0(e~**)' as x tends to infinity. We can now determine C by requiring 
that (5) be valid at another value of x. For x such that [1 — K(x)]>0 
we can solve (5) for C to obtain 


[1 — K(x) le” 


C(x) = = , 
1-| e“dK(y) 


(6) 


Now if we are interested in approximating the behavior of W(x) near 
x = 0, a reasonable choice might be to solve for C from (6) with x = 0; 
that is, choose 


1-K(0) _1-—K_(0) 


gee ee (7) 
1—K-(a) 1-K _(a) 


C= Co(a) = 
where 
0 
R_(s) = i e“dK(u). 


[Note that clearly 0 = Co(a) = 1.] 

We will refer to the exponential approximation of W(x) obtained by 
choosing a to satisfy (3) [“(5) at x = ©” ] and C to satisfy (7) [(5) at 
x = 0] as approximation Ago, i.e., Wa,, = 1 — Co(az)e “**. (The sub- 
script E refers to the use of the exact dominant root—we will consider 
other alternatives shortly.) 

Note that if the true delay distribution were given by a simple 
exponential, i.e., W(x) = 1 — Ce, then the value of C(x) determined 
from (6) would be identically a constant and W,, ,(x) = W(x). Thus, 
the variation of C(x) with x, as determined by (6), gives some indica- 


* That is, the difference between the left-hand side and right-hand side will tend to 
zero even when multiplied by Ce“*. 
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tion of how exponential the waiting time delay distribution is. A more 
“sophisticated” approximation might be obtained by minimizing the 
variation of C(x) with respect to some suitable norm, however, we will 
concentrate here mostly on simpler techniques. 

Note also that C(x) as given by (6) appears in various (exponential) 
bounds for W(x). For example, Kingman® and Rossberg and Siegel’ 
prove the following inequality for the complementary waiting time 
distribution, W°(x) = 1— W(x). 


Cre 7 = W(x) Se %*, 
where dg is given by (3) and 
Cr= inf C(x). 


x20 y, 
{1—K(x)]>0 


Ross? gives the bound 
W(x) = Cye “"*, 
where 
Cu = sup C(x). 
[I-K(x)]}>0 
(See also Ref. 9.) Hence, it seems quite natural to choose a specific 


value for x in C(x) to obtain an exponential approximation to W(x), 
with C, < C< Cv. 


3.2 An alternate method for determining C 


As we shall see shortly, Aro does provide a reasonable approximation 
to Pp, but the resulting approximation for w is not as good. One might 
consider choosing C from (6) with another choice of x; however, here 
we consider another alternative. If we compute the mean waiting time, 
Ww, using eq. (1) we obtain 


o-{ xd W(x) -| +| WoodK +f a. Ws — Ky) |. (8) 
0 0 —oo 


Again, assuming the exponential form (2) for Wa(x) [and, hence, 

that wa = (C/a)], (8) yields 

S.C iL (0) 

Dy SSS Se eee 
a K(0)—K-_(a) K_(0) — K_(a) 
where us = fo udK(u), R.(s) = fo e“dK(u) and K_(s) is as in (7). If a 
is chosen from (3) and C then from (9), we will refer to the resulting 
approximation as Ag,, i.e., Wa,,(x) = 1 — Ci(az)e“**. As we shall see, 
this will result in a good approximation for the mean delay via (9); 


(9) 
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however, it is not clear from this relation that the resulting C need be 
less than 1. (It is clearly =0.) A correct probabilistic exponential form 
for Wa(x) which incorporates both (7) (and, hence, results in a reason- 
able approximation for the probability of delay) and (9) (and hence 
results in a good approximation for the mean delay) can be obtained 
in the following manner. First, find ag from (3), then C = C* = Co(az) 
from (7), and then the approximate mean delay, w = Ci(az)/az from 
(9). The constant a is then chosen to be 


_ Co(az) 


Cilany 


a=a 


We will refer to this approximation as A*. We now look at a more 
organized way of obtaining such approximations. 


3.3 Moment matching 


Equation (9) was obtained by matching the first moment of the 
Lindley integral equation, i.e., from (8). We note that if we match the 
Oth moment in a similar way we are, in fact, also led to (7). That is, if 
we seek a solution to (1) of the form 1 — Ce, and match the first two 
moments (0th and Ist) of (1) to determine both unknown constants, 
we are led to eqs. (7) and (9), ie., 


_ 1-K() 
1— K_(a) 
C its 


a K(0)—K(a) 


as a pair of equations in the two unknowns C and a. These can be 
combined and rewritten as 


7 = LK) - K-@) 101 - KO] 


- x (10) 
us[1 — K_(a)] 
_ 1-K(0) a1) 
1 — K_(a) 


Thus, one can first determine a from (10) and then C from (11). 
Clearly, if such an a exists, a € [0, ©], C € [0, 1]. 
Denoting the right-hand side of (10) by f(a), we note that 


f(a) > 0 
a0 


_, KOU-KO]_ . 


a—>oo + 


f(a) 
and also 
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—K’(a)[1 — KO)? 
w[1 — K_(a)P 


f(a) 


has the properties 


f(a) > 90, a=0 


FO Ha si 





U+ 
f(a) > 0. 
Moreover, 
oat 1 = 2 7g _ "aa Fa 2 
f"(a) = [ KO) (K_(a)[{1 — K_(a)] + 2(K_(a)) 26°: wo. 


ts. [1 — K_(a)P 
Thus, we see that (10) possesses a unique positive root, which can be 
obtained via the iteration scheme 

An+1 = ‘id (an) 


starting from any positive ao. 
Thus, we can always determine the desired root ao, from (10) and 
then the resulting C = Co, from (11). This approximation 


Wa, (x) =1- Coie %"* 
is referred to as Ao. 


3.4 Light traffic 


Under quite general conditions, the function K_(a) defined in (7) 
and with a satisfying (3) has the limit 
K_(a) > 0, 
p—0 
e.g., fix B(r) and the “shape” of A(t) and let 1/A — . Hence, for p 
small, (7) and (9) lead to the following intuitively appealing “light” 
traffic approximations (Azr) 





Cir = 1 -— K(0) (12) 
~ Cir ts 
HS Gane ROY oe 


IV. HEAVY TRAFFIC 


In the case of heavy traffic, an obvious possibility for the exponent 
a is the heavy traffic limit (e.g., see Ref. 4): 


—2u 
ayn= TZ (14) 
Ou 
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where i, o2 are the mean and variance, respectively, of @ = 7 —t 
(service minus interarrival time). We can use this value for a in place 
of az in the approximations of Section III. In particular, if we use 
a = ay in (7) we will refer to the resulting approximation as Ano. Note 
that the resulting C is strictly less than one unless ay = 0 (p = 1). If we 
use a@ = ay in (9), we refer to the resulting approximation as Ay. 
These two approximations (as well as those we discuss next) offer 
potential improvements to the more standard heavy traffic approxi- 
mation that can extend its applicability to lower load levels. 

In many cases, where heavy traffic approximations are applied, the 
structure of K(u) is either not known or too complex to be used in 
further analysis. Standard heavy traffic approximations generally only 
make use of the first two moments of K(u), i.e., Z and 02. We can also 
obtain results which use only a, o2 by making the standard heavy 
traffic assumption that K(u) is approximately Gaussian. For example, 
if we not only assume that a = ay in (7), but further that K(u) is 
Gaussian, we obtain the following a for C using (7): 


1 — Erf 
Ceo =—___—__-. (15) 


oo En =) 


We refer to this approximation as Ago. Note that (15) implies the 
limiting behavior for |z| small 


2\u 
a \22 : 
os 1-224 (16) 
u-0 coals! u—0 T Ou 

(e>1) 1 + — (p—1) 


Now combining (16) with (14), we obtain the following (limiting) 
approximation to the mean delay for | z| small 


e C 1 2 
Woo = a ar). (17) 
QHy #0 AH 7 


(p— 1) 














In a similar manner, (9) yields the approximation Ag; 


ug Ae) ; zh, - Ea =) | 


C. 
We. = ee feel ee (18) 








which has the limit 
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= Ce, ou Ne 1 Ou [7 
tig. =—— ~ See ee Se 19 
Olan z0 2|a| 2 V2 z0an 2 V2 ny) 


(p> 1) (p> 1) 


Now if K(u) were, indeed, Gaussian, then the resulting (heavy 
traffic) mean delay, Wg, is known to have the limiting bounds (e.g., see 


Ref. 4). 
1 2 se 1 
—-— o.\/-s Wo =—. (20) 
QH T QH 


Compare (17), (19), and (20). 


V. SOME SPECIAL CASES 


In this section, we look at the general behavior of these approxi- 
mations for some special cases where exact results are readily obtain- 
able. We begin by looking at the GI/M/1 case where the approxima- 
tions of Section III are exact. This will serve as a simple illustrative 
example, as well as provide some additional insight into the behavior 
of the heavy traffic approximations of Section IV. 


5.1 Case: GI/M/1 
For a GI/M/1 system, we can write (for u > 0) 


K(u) = i: [1 — e*“*]dA(t) = 1 — pe, u> Od, (21) 
0 
where 1/a is the mean of the (exponential) service time and 
p= | : e“dA(t) = A(a). 
0 
We, thus, have 
us = ap [ xe “dx = p/a (22) 
0 


K(0)=1-—p. (23) 


Now if a satisfies (3), 
0 oo 
K_(a) = i e“dK(u) =1— | e™“dK(u). 
a 5 


Hence, 


K(a) =1- » | e“e “du = 1+ 


0 


ap 
a-a. 





(24) 
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Using (23), (24) in (7) yields 











CAC) (25) 
a 
while (22), (23), (24) in (9) yield 
Ci(a) - a—-a (26) 
a aa 


If a is chosen to be the exact root of (3), then these both lead to the 
exact waiting time distribution. (Recall, for a GI/M/1 system 


—a 
W(x) =1- — ee °E* 
a 


e.g., see Ref. 10.) 
If we use the heavy traffic approximation for a, i.e., (14) then (25) or 
(26) can be used to obtain the following approximation for C 


, 2h 
. of 2 (1-p) 


Cup = =1- : (27) 
Qa 











aon =p 
that is, a possible improvement to C = 1 for heavy traffic. Thus, the 
resulting mean delay is found to be 

Wn = — = — -- === --. (28) 
Note that this correction term to 1/ay as an approximation to the 


mean delay is typical. For example, for an M/G/1 system it is easy to 
show that the true mean delay, w, satisfies 


5.2 Case: M/G/1 


For this case, we have 
K(u) = | . ed“ dB(t) = ge™, u<0, (29) 
0 
where 1/A is the mean of the (exponential) interarrival time and 
ge | : e“dB(r) = B(A). 
0 


We, thus, have 
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oo fos) 0 
us, = | udK(u) = | udK(u) — | udK(u) 
0 se a 





_ Ue bee 1 14q 
-a- a] ue du = (=-3 +4) (30) 
K(0) = (31) 
and 
P i d 
K_(a) = qd [ e“edy = a x: (32) 


Using (7), we obtain the following approximation (Ago) for the 
probability of delay, Pp 
eee ee 
ee: 
Qr+A 


Co(az) = (33) 





while from (9) we obtain the following approximation (Ag) to the 
mean delay, w 


Z Ci(ae) _ (A — a + ga)(az + A) 


z= (34) 
ar qadar 


(Since we know that for an M/G/1 queue, the true value of Pp is p, we 
could use this fact, together with (33) to obtain a simple approximation 
for az; we discuss this possibility shortly.) 

Using the known limits for the true a = az as p > 1, (33) readily 
yields the following: 


Co(az) ~ p(1—prt’a’) ~ p (35) 
rA-0 A-—0 
(p—+0) (p>0) 
a 
Clie Pees a3 7. (36) 
Ama 1- gq & doa 
(p> 1) (p— 1) 


where we have used the fact that for A > 0 








ee 272 
q= | e*"dB(r) ~l1- ‘ + a . (37) 
, 0 a 2 
We similarly obtain from (34) 
_  Ci(az) Ar? r Ar? 
= ~ —— —); ~ —— 38 
ee Qe 0 2(1—- 9) az} »30 2(1 — p) 38) 
(p—0) (p—0) 
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and for \ — a, clearly 


Ci(a 1 
: aE Ama AE 
(p—1) 


Equation (38) shows that as \ — 0, we obtain the familiar P — K 
(Pollecek-Khinchin) formula for w in an M/G/1 system, while (39) 
shows that with 


= 1 1 
a= ae, WE OS os 
pl QE pl QH 


Now recall for the approximation A*, we take C* = Co(az) and 


_ Colar)ae 
- Ci(ae) 


a* 
Hence, for A*, the resulting exponent for the exponential approxima- 
tion is given by 

qaXaz(1 — q) 


Aa at qa)(ae + X= Gd) as 


From (40) we obtain 


2 1 
a®s a> = == (41) 
A-0 ar? R 
(p—0) 
ar(l — 
at ~ cel - jane | a ees (42) 
Aa Qa ha 
(p— 1) (p— 1) 


where R is mean forward recurrence time of the service time distri- 
bution. Note we have used the fact that 
Qgr™~ 0. 
p> 1 
The first of these is a very interesting relation. It is easy to show 
that for an M/G/1 system, that as p — 0 the true dominant root ag 
satisfies 


—> 


B(-ar) = i e““"dB(t) ~ (43) 
0 p 


Hence, for example, if B(s) is rational, ag tends to the smallest (in 
magnitude) pole of B(s), which is not consistent with (41). However, 
it is easy to show that for an M/G/1 system, the mean delay condi- 
tioned on being delayed, is just the mean forward recurrence time, R— 
the dominant contribution is from the case where the arrival finds one 
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customer in the system. That is, for p — 0, the dominant root really 
may not be the best quantity to use for an approximation, indeed, the 
limit (41) is most likely preferable. We, thus, have via A* an exponen- 
tial approximation to the true delay distribution for p — 0 which can 
be decidedly nonexponential but with the correct limiting probability 
of delay and mean delay. 

The approximation Ao, also has some very interesting properties for 
the M/G/1 case. Considering (7) and (9) as defining two equations for 
the two unknowns C and a, we obtain the two relations 


Cqd 


and 
Caan 
Combining (44) and (45) we find that 
A 
C=-=p. (46) 
a 


That is, for the M/G/1 case, Aoi yields an exact expression for the 
probability of delay, Pp. Using (44) or (45), we obtain the following 
approximation for the dominant root: 


2 
ap"q pa(1 — q)(1 — p) 
a = ————————. — ap = ———_————_- (47) 
(0 — (1-—q)) p— (1-4) 
Note that 
1l-g)(1l- 
_ a= (=p) a 
pl q 
but 
a-7--—= 
00 qq” 
as with A*. 


Note that since for an M/G/1 system, we know that Pp = p, we 
could have used this fact directly in either of the two approximations 
[ (33), (34)] to obtain an approximation for the true dominant root, az. 
This, of course, would result in 

me pa(1 — g)(1 — p) 
e-(l—q) ~ 


ie., just (47). 
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For the mean delay, w, Ao, yields the approximation 





= p—-(1—4q) 
Wo, =——_————_-——_, (49 
* ad = q)(1 = p) ! 
which has the asymptotic properties 
2 Ar? 
Di (50) 
ae 2{1—p- nae 
p p 9 
s q 
wo ~ ————————. 51 
* pra a1 — g)(1 = p) on 
(p—1) 


Comparing the asymptotic behavior of Ao; and A*, we see that while 
the former has a more desirable property for the approximation to Pp, 
the behavior of the approximations to W and az are more desirable for 
A* than for Ao. This leads us to consider another approximation, one 
that combines the best properties of each. The simplest method is to 
define approximation A** via wa-+(x) = 1 — C**e°***, where 





C** _ Cor 
and 
gts Cor = Code 
We. Cri 


The use of the heavy traffic ay, of course, yields similar results to 
those for Aro, Az:, and A* as p — 1. Again, there is no simplification 
for the Gaussian case. 


VI. SOME NUMERICAL EXAMPLES 


We consider several numerical examples of varying complexity to 
illustrate the accuracy of the various approximations. Because of the 
large number of possible combinations of approximations and quanti- 
ties of interest, we will not cover all possibilities for every example. 


6.1 Case: M/D/1 
For the M/D/1 case, 


1, u>T 


—\(7—u) =: 
» UST 


Kw) = Pi=7-Fsw={, 


Hence, we can readily obtain eq. (3) for the dominant root, az 





oo r 7 
aUdK a art — ; 
i ; e (u) Ey e 1 (52) 
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Rewriting this equation as 
At(e% — 1) = ar, (53) 


it is clear that in addition to a = 0, there is always one positive real 
root, az, for any finite A. Moreover, 


ag —> JO, ag > , 
AT=p—1 A—0 
(p—0) 


The other quantities needed for the various approximations are 











g = B(A) =e? (54) 
K(0)=q=e° (55) 
4 _ ga _ er 
Bale aet+AX arti 098) 
_ Ty dg): Ps _ Le? 
-=(7 s+2)-( 7+) oo 
_ 1 1 ae | 
i 
o = 2 (59) 
and, hence, 
—2u 2 = 
Qu = = aoe) (60) 
Qu T 


With az from (53) and (54) to (60), we can compute the various 
quantities predicted by our approximations. For simplicity, we will 
assume 7 = 1] in the following so that A = p. 


6.1.1 Probability of delay, Pp 


Table I shows the value of Pp as predicted by the various approxi- 
mations. Recall that Pp from Ao, is exact for this case. We see, 
however, that Azo(A*) produces a good approximation to this quantity. 
The value of Pp predicted by Ar; is seen to be significantly poorer, 
particularly for p small. Actually, our heavy traffic approximation Ax 
does significantly better than Ag. Note that Ago provides a reasonable 


Table |—Probability of Delay, Pp, case M/D/1 


p True Ao, (A* *) Axro(A*) Ano Ago Ag, ALr 
0.01 0.0100 0.0100 0.0100 0.0149 0.1920 0.0327 0.0100 
0.1 0.1000 0.1000 0.0975 0.1406 0.2256 0.1986 0.0952 
0.5 0.5000 0.5000 0.4756 0.5647 0.4462 0.6170 0.3935 
0.9 0.9000 0.9000 0.8864 0.8975 0.8524 0.9276 0.5934 
0.99 0.9900 0.9900 0.9884 0.9885 0.9842 0.9928 0.6284 
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Table Il—Mean delay, w, case M/D/1 


Ag (A - ? 

p True A**) Ao, Axo Am Ag Aur ALr 
0.01 0.00505 0.00505 0.00507 0.00154 0.00759 12.53 50.51 0.00503 
0.1 0.0556 0.0550 0.0565 0.0270 0.0832 1.589 5.556 0.0536 
0.5 0.5000 0.4911 0.5415 0.3785 0.7026 1.033 2.000 0.3513 
0.9 4.500 4.478 5.166 4,279 5.027 4.895 5.556 0.8379 
0.99 49.50 49.56 57.55 49.35 50.13 49.88 50.51 0.9829 


approximation in the heavy traffic region and that A_r provides a good 
approximation in the light traffic region. 


6.1.2 Mean delays, w 


Table II shows the mean delays resulting from the various approxi- 
mations. We see that Azi(A*) provides the best approximation to the 
true w. The approximations Ay and Ao, are somewhat similar, with 
An, being better at higher loads, while Ao; appears better in the 
midrange. The approximation Ago appears quite poor for low loads 
but improves with increasing load. While not comparable with Ayu, 
Ag, does provide some improvement over Aur. In the light load region, 
Atr is seen to provide a good approximation. It is interesting to note 
that while Ay, Aci, and Ayr all use the heavy traffic ay from (60), 
which has the poor behavior 


aQu— 0, 
p90 


this only results in the anomalous behavior 


Wa —> © 
p—0 


for Ag; and Aur. 
6.1.3 Dominant root, ae 


Table III shows the exponent that would be used in the (exponential) 
approximation for W(x) corresponding to each of the approximations. 
Recall that for Azo and Ag, the exponent is exact, i.e., a = az. We see 
that A* is somewhat better than Ao, for heavier loads and both tend 
to 2 as p — 0—the inverse of the mean forward recurrence time of the 
service time. As should be expected, the heavy traffic a, ay, is quite 
poor for light traffic. 


Table !II—Dominant root, a, M/D/1 


p True A* Ao Aun 
0.01 6.475 1.974 1.974 0.0198 
0.1 3.615 1.775 1.770 0.1800 
0.5 1.256 0.9685 * 0.9234 0.5000 
0.9 0.2071 0.1979 0.1742 0.1800 
0.99 0.0200 0.0199 0.0172 0.0198 
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Note that the differences between Pp and wW as predicted by Ago 
and Agi, particularly for light loads, indicates that the true delay 
distribution is essentially nonexponential. Yet, we see that assuming 
an exponential form can result in excellent approximations for Pp and 
w. Moreover, the reasonable results predicted by Ano and Ay; for light 
loads where ay is far from any reasonable choice of a dominant root 
indicates that these techniques are somewhat robust with respect to 
the choice of the exponent, a. 


6.2 Case: M/E2/1 


We now look at several of the resulting approximations for the 
waiting time distribution, W(x), and compare these with the exact 
results for an M/E>2/1 system. Specifically, we assume A(t) = 1 —- e~™’ 
and that B(t) is the convolution of two exponentials with unit mean. 
Hence, we have for the respective Laplace Stieltes transforms 


2 A 
saa eo 


For this system, the exact delay distribution, W(x) can readily be 
obtained via standard technique (e.g., see Ref. 10). The result is 


W(x) =1—- Ce** — Coe, (62) 
where 
(2—A) —[A(4+A)]'” 
Q = 
2 
(2 —A) + [A(4 + A)]’” 
ag = 
2 
aa 2 2 
C, = (1 — p)(1 — a) 
ai(a2 — a) 
C= —(1 — p)(1 — a2)’ 
ai(a2 — as) 


From (61) and (62), we can readily compute all of the quantities we 
need. We have 


a 1 1 1 
= a ra Ds a Oa 
q = B(A) G+" a: u=2 i 
and 
1 
Sie re ao 


yielding expressions analogous to (54) to (60). 
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Tables IV, V, and VI show comparisons of the exact delay distribu- 
tion, as well as the mean delay, with some of the approximations we 
developed for p = AT = 2A = 0.1, 0.5, and 0.9. The approximations for 
the probability of delay, Pp, and mean delay, w, show similar behavior 
to that for the M/D/1 case. In addition, we see that the approximation 
Ar, seems to have the best tail behavior (followed closely by A**). 

Thus, from these two examples, we might conclude that A** is the 
best overall approximation, while Ag, has slightly better tail behavior. 


6.3 Case: D/D2/1 (D/G/1) 


We briefly consider here the D/G/1 case and a simple numerical 
example which illustrates some interesting properties of our approxi- 


mations. 


For the D/G/1 case, we have 


x 


0 
0.5 
1.0 
5.0 
10.0 


Ww 


True 


0.9000 
0.9220 
0.9414 
0.9962 
0.9999 


0.1667 


Blu+d); u=-d 


Table IV—P(Delay = x) for case M/E2/1 


Azo 


0.9014 
0.9323 
0.9534 
0.9977 
0.9999 


0.1314 


Ar Ao A* A** 
0.8760 0.9000 0.9014 0.9000 
0.9148 0.9257 0.9268 0.9261 
0.9414 0.9448 0.9460 0.9454 
0.9971 0.9950 0.9950 0.9951 
0.9999 0.9997 0.9997 0.9998 
0.1654 0.1680 0.1654 0.1654 


Note: p = 0.1, True W(x) = 1 — 0.1667e~°"™ + 0.06667e7 17%, 


x 


0 
0.5 
1.0 
5.0 
10.0 


Ww 


Table V—P(Delay = x) for case M/E2/1 


True 


0.5000 
0.5644 
0.6272 
0.9084 
0.9848 


1.500 


Ago 


0.5119 
0.5922 
0.6593 
0.9192 
0.9866 


1.357 


Ag Ao. A* A** 
0.4666 0.5000 0.5119 0.5000 
0.5547 0.5742 0.5859 0.5776 
0.6277 0.6374 0.6488 0.6431 
0.9117 0.8998 0.9058 0.9073 
0.9854 0.9799 0.9818 0.9828 
1.483 1.556 1.483 1.483 


Note: p = 0.5, True W(x) = 1 — 0.5532e~°°* + 0,05317e7 1°, 


x 


0 
0.5 
1.0 
5.0 
10.0 


Ww 


Table VI—P(Delay <= x) for case M/E2/1 


True 


0.1000 
0.1244 
0.1509 
0.3497 
0.5359 


13.50 


Ago 


0.1057 
0.1354 
0.1641 
0.3617 
0.5446 


13.26 


Ag Ao, A* A** 
0.0919 0.1000 0.1057 0.1000 
0.1220 0.1278 0.1349 0.1296 
0.1511 0.1548 0.1632 0.1582 
0.3518 0.3426 0.3584 0.3557 
0.5374 0.5198 0.5397 0.5388 

13.46 14.33 13.46 13.46 


Note: p = 0.9, True W(x) = 1 — 0.911le~°%"** + 0.01110e7 "8", 
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where d is the constant interarrival time. Thus, 


K(0) = Bid) 


0 co 
K (a) = | e*~dK(y) =1- | e“dK(y) = 1—- K,(a) 
0 0 


ls = | xdK(x) = K4(0), 
0 
where a satisfies 
| e®dK(y) = R_(0) + Ko(0) = i. 


This class of systems (D/G/1) is particularly important in the 
analysis of a certain class of schedules for computer systems with real 
time applications. For example, see Refs. 1 and 2 where the approxi- 
mations given here are applied to the study of these schedules. The 
service time distribution for these systems are generally discrete in 
nature. Here we look at a simple special case with 


A(t) = U(t — d) 
B(r) = pi U(t — 81) + poU(t — 82), 


where U(x) is the unit step function, pi + pe = 1 and s; < d < s,. From 
these, we readily obtain 


K(0) =~ 
K_(a) = pre 
RK.(a) = poe 


ta = K',(0) = p2(s2— d), 
and ag satisfies 
R_(a) + K.(a) = &€(pie™ + poe*) = 1. 
Thus, (7) and (9) readily yield 


1 —?pi 
Pp = Co(aez) ~ T= pieeero (63) 


= C;(az) P2(s2 — d) 
Oe SS aad. 
QE Pi — pie 


(64) 


ap(s;—d)° 


For simplicity, we consider the special case s; = 0, s2 = 2d. In this 
case, the exact solution satisfies the difference equation 


P; = pi Pis1 + poPi-i, (65) 
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where P; = P(work in the system = id at an arbitrary arrival epoch), 
i.e, P; = P(arrival delayed id). Taking d = 1, the solution to (65) is 
readily found to be 











P;=P(Delay=i)=(1-nyrs r= ~ (66) 
1 
Hence, 
Pape (67) 
P1 
Fe a ee (68) 
1-r Pi — P2 
and ag = In(r7?). 
Now (63) and (64) yield 
1 races 
Ciaj=— = (69) 
1 —p2 pr 
C 
Wr = Cilaz) = pe , (70) 
aE Pi — Pe2 


i.e., Azo gives the exact value for Pp and Ag, gives the exact value w. 
In particular, this implies that the approximation Ago is exact at grid 
points— W,,,,(x) is exact for x = 1, an integer. 

Table VII shows the resulting delay distributions for several of our 
approximations. We see here that A* seems to be the best overall 
approximation. Note that Ao: (and hence A**) result in somewhat 
poorer approximations for this case. This may indicate that A* is 
potentially more robust. 

As another comparison for the mean delays predicted by our ap- 
proximations, we have included on the table (denoted by w[K — L]) 
the results of using an approximation for the mean delay given by 


Table Vil—P(Delay = /) for case D/D2/1 


L True Aro Ar A* Ao, 
0 0.1818 0.1818 0.0970 0.1818 0.1282 
1 0.3306 0.3306 0.2612 0.3178 0.2331 
2 0.4523 0.4523 0.3955 0.4312 0.3254 
3 0.5519 0.5519 0.5054 0.5258 0.4066 
4 0.6334 0.6334 0.5953 0.6046 0.4780 
5 0.7000 0.7000 0.6689 0.6704 0.5408 
6 0.7546 0.7546 0.7291 0.7252 0.5960 
7 0.7992 0.7992 0.7784 0.7709 0.6446 
8 0.8357 0.8357 0.8187 0.8089 0.6874 
9 0.8655 0.8655 0.8516 0.8407 0.7250 
10 0.8900 0.8900 0.8786 0.8672 0.7581 
Ww 4.50 4.08 4.50 4.50 6.80 


Note: p = 0.9, True P(Delay = 1) = E _ (3)| (2) , W(K — L) = 6.43. 
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Kramer and Largenbach-Belz.'' This approximation is essentially a 
heuristic extension of an approximation originally obtained by Hey- 
man” via diffusion techniques. The rather simple form of the approx- 
imation w(K — L) given below makes its use very appealing. 


tpe (C2 + C?) 


w(K — L) =———_—__,, 
( ) 2(1 — p) 
where 
2(1 — p)(1 — C?)? 
eee = OS I 
_ -8p(C2 + CF) 
2 
(1 — p)(Cz — 1) C= 


AC? + °C? 


and C,, C; are, respectively, the coefficients of variation of the service 
time and the interarrival time. (Note that w(K — L) is exact for an 
M/G/1 system.) 


6.4 Case: H2/E3/1 


As a last numerical example, we consider an H2/E3/1 system. Spe- 
cifically, 








“ is Pia Ded2 
BONS Ta cate 
Q 1 

B(s) “G+ roe 


Again, the exact waiting time is relatively easy to find via standard 
techniques (see Ref. 10), which yield 


W(x) =1— Cye™** — Cre~%* — Cre~2* 
where the aq; are the roots of the equation 
R(s) = A(—s) B(s) = 1, 


a denotes the complex conjugate of a, and the C;,’s are the correspond- 
ing residues which are readily obtainable (see Ref. 10). 
For the approximations of the preceding sections, we find that 


K(0) = piqi + pegs 


2 A r 
R (a) = Mga 4 P2q2h2 
Alita, Asta, 


316 THE BELL SYSTEM TECHNICAL JOURNAL, MARCH 1982 


Table Vill—P(Delay < x) for case H2/E3/1 


Xx True A* A** Ao, * Ao 

0 0.7834 0.7897 0.7835 0.7835 0.7835 

0.5 0.8121 0.8265 0.8225 0.8212 0.8080 

1 0.8406 0.8569 0.8544 0.8523 0.8377 

5 0.9731 0.9694 0.9702 0.9680 0.9740 
10 0.9977 0.9955 0.9959 0.9953 0.9978 
20 0.9999844 0.9999054 0.9999223 0.9998966 0.9999849 
Ww 0.5536 0.5458 0.5458 0.5662 0.5523 


Note: p = 0.2, True P(Delay < x) = 1.0 — 0.3260e~°*9** + e~!?!*[0,10944 cos(0.3129x) 
— 0.08673 sin(0.3129x)], Wa, (x) = 1 — 0.3161e~°4"** + 0,0995e~""**, w(K — L) = 0.5476. 


Table IX—P(Delay <= x) for case H2/E3/1 


x True A* A** Ao, Aoi 
0) 0.5728 0.5867 0.5730 0.5730 0.5727 
0.5 0.6145 0.6408 0.6305 0.6280 0.6104 
1 0.6578 0.6877 0.6803 0.6752 0.6550 
5 0.8997 0.8982 0.8996 0.8913 0.9008 
10 0.9806 0.9749 0.9764 0.9723 0.9808 
20 0.9993 0.9985 0.9987 0.9982 0.9993 
40 0.9999990 0.99999997 0.9999998 0.9969756 0.9999990 
Ww 1.497 1.475 1.475 1.561 1.494 


Note: p = 0.4, True P(Delay < x) = 1.0 — 0.5176e~°?7* + eo 1 -[0.09038 cos(0.3793x) 
— 0.06644 sin(0.3793x)], Wa, (x) = 1 — 0.5124e-°°?* + 0.0851e717*, w(K — L) = 1.488. 


Table X—P(Delay = x) for case H2/E3/1 


x True A* A** Ao, Ao 
0 0.1785 0.1906 0.1791 0.1791 0.1785 
0.5 0.2041 0.2253 0.2148 0.2122 0.2024 
1.0 0.2325 0.2586 0.2489 0.2440 0.2314 
5.0 0.4609 0.4779 0.4738 0.4562 0.4616 
10.0 0.6588 0.6632 0.6627 0.6398 0.6592 
50.0 0.9912 0.9899 0.9904 0.9866 0.9912 
100 0.9999093 0.9998741 0.9998874 0.9997827 0.9999094 
w 9.285 9.230 9.230 9.967 9.928 


Note: p = 0.8, True P(Delay < x) = 1.0 — 0.8517e~-°"* + e~'“?"[0.03026 cos(0.4548x) 
— 0.02125 sin(0.4548x)], Wa, (x) ;q¢ 1 — 0.8517-°°"™* + 0.0302e"'?*, w(K — L) = 9.266. 


where 


Tables VIII, IX, and X compare our various approximations for this 
case with p; = 0.25, A2 = 2A1 and p = 0.2, 0.4, and 0.8." (The approxi- 
mation denoted by A¢, will be discussed shortly.) Here we see again 
that A* and A** yield quite good approximations over a wide range. 

Again, for comparison, we have included the approximation w(K — 


t The roots needed for the exact solution (and the approximations) were obtained by 
using a program developed by A. E. Eckberg. 
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L). Unlike the D/D2/1 case, we see that w(K — L) provides a good 
approximation to the mean delay for these cases. 


Vil. EXTENSIONS 


As we have seen, even with the assumption of only a single expo- 
nential for the form of W(x), we are led to a wide variety of approxi- 
mations by using the Lindley integral equation (1) to determine the 
two unknown coefficients. We consider here some extensions for the 
case where we wish to choose a more complicated form for W(x). 


7.1 Method of moments for hyperexponentials 
In many cases, W(x) admits the expansion 


W(x) =1- y ce, (71) 
=1 


l 


where the coefficients C;, a; are not necessarily real. In such cases, an 
approximation of the form 


Wa(x) =1- ¥ Ce (72) 
i=l 
seems most appropriate. In general, of course, we do not know the 
form of W(x). However, if the equation 
K(-s) = A(s)B(-s) =1 (73) 


can be solved more than one nonzero root, then it seems reasonable to 
attempt to incorporate additional roots in an approximation of the 
form (72). One can discretize (1) to obtain a set of (implicit) equations 
for the needed C;; however, we show here that the method of moments 
introduced in Section III can readily be extended to obtain a set of 
linear equations for these quantities. For this purpose, it is convenient 
to consider the equation for W(x) = 1 — W(x) corresponding to (1), 
i.e., 


LLW°](x) = W°(x) — 1+ K(x) - | W(x — y)dK(y) =0. (74) 
We denote the Lth moment of (74) by pr: 
pL = | x"dL[ W°|(x). (75) 
0 


Using (74) in (75) we obtain 


Lo = | aW*(x) + | dK (x) + | W*(—y)dK(y) 
0 0 —oo 
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fo a} 


UL = | x’dW°(x) + i x’ dK (x) 


0 


+L i te | W(x — y)dK(y)dx, L>0. (76) 
0 00 


With the notation 


we = | x'dW(x), “= | x’ dK(x) 
0 0 - 
(76) can be written 
0 
Yo = W5 + Uo + | W°(—y)dK(y) 
wn = WE + Ur 
+ L| a W(x — y)dK(y)dx, L>0O. (77) 

0 —o 


[Note % = 1 — K(0) and u, = uw, of the preceding sections. ] 
Now using the form 


Wala) = Ce** (78) 


Lo = 0 leads to 
yy Ci(K; — 1) = —t, (79) 
where 
0 
K; = | e**dK(y). 


After some algebra, pn, = 0 for L > 0 yields 





LIK; Lazy. %L(L—-1)--- (L—R) _ L! 
x a( ea = ge ee 
= —-UuL (80) 


The empty sum is taken to be zero. 

Thus, given a set of roots a;, i = 1, --- , m of (73) eqs. (79) and (80) 
allow us to readily compute the desired coefficient C;. (Note that this 
method is identical to the common techniques of “method of moments” 
frequently used for other classes of integral equations [e.g., see Ref. 
13).] We refer to the resulting approximation as 


E 
Aoi atts m—l. 
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Table X1!—Dominant residue for case H2/E;/1 


p True Creo Cry Cor Cy (A 0,1) 
0.2 0.3260 0.2103 0.2714 0.2164 0.3161 
0.4 0.5176 0.4133 0.4841 0.4270 0.5124 
0.8 0.8517 0.8094 0.8444 0.8209 0.8517 


7.2 Numerical example (H2/E3/1) 


Note that if we apply the results of (71) to the H2/E3/1 example of 
Section VI and include all three roots, ai, a2, a3, then the resulting 
delay distribution will be exact [that is, if all roots are given, (80) is 
equivalent to the standard methods of determining the appropriate 
residues]. To see how we can use more structure to improve our 
approximations, we will use the true dominant root a; as one of our 
roots, but take a2 = Re(a2) as another root, the resulting approxima- 
tion is shown on Tables VIII, [X, and X, where it is denoted by Ao. 
We see that the inclusion of this additional term results in an approx- 
imation that more closely captures the structure of the true delay 
distribution. This is perhaps best illustrated by Table XI. Here we 
have given the values for the dominant residue, i.e., the coefficient of 
the exponential with the dominant exponent, for the various approxi- 
mations. We see that of the three single exponential approximations 
Ar, has a C value nearest the true value. This shows why Az, tends 
to have better tail behavior than the other single exponential approx- 
imation. However, we see that Ao, resulted in an excellent approxi- 
mation to the true dominant residue, even though we did not use 
another exact root. Hence, the excellent agreement on Tables VIII to 
X for the extreme tails of the distribution. This behavior can be very 
important in studying computer systems with dedicated real-time 
applications where often criteria are specified in the 10~° probability 
range, i.e., the probability of delay greater than T shall be less than 
10°. 


Vill. CONCLUSIONS 


The basic idea we have exploited is to choose a functional form for 
an approximation Wa,(x) to a true delay distribution, say, W(x), and 
use the well-known Lindley integral equation to find the undetermined 
coefficients. For the case where Wa(x) is exponential— Wa(x) = 1 — 
Ce~“—-we have used this technique to develop several approximations, 
some of which make use of the explicit structure of the relevant service 
and interarrival time distributions, while others require only moment 
information. 

Although not always the best choice, A* seems to provide the 
robustness that one would require of a good approximation. The 
resulting approximation for the mean delay is excellent and the result- 
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ing probability of delay quite good, although A** provides a better 
value for the latter. For predicting tails of distributions, we see that 
Ag, is the best of the simple exponential approximations. We have 
also seen that increasing the complexity of the forms of the waiting 
time distribution assumed in the approximation, e.g., using more than 
one exponential, can result in extremely accurate predictions of the 
tails of the delay distribution. 


IX. ACKNOWLEDGMENTS 


I am deeply indebted to A. E. Eckberg and D. L. Jagerman for many 
helpful discussions and suggestions. I would also like to thank D. F. 
DeMaio for her helpful comments on this paper. I am also very 
appreciative of the many useful observations and suggestions of the 
reviewers. 


APPENDIX 
Summary of Notations and Formulas 


¢ = interarrival time 
A(t) = interarrival time distribution 
L.S.T. = Laplace-Stieltes Transform 
A(s) = L.S.T. of A(t) 
E = Expected value 
t=E(t)=1/A 
T = service time 
B (7) = service time distribution 
B(s) = L.S.T. of B(r) 


T=E(7T) =1/a 
p=A/a 
u=7—-t 


K(u) = distribution function of a 
K(s) = L.S.T. of K(u) = A(—s)B(s) 
u=KH(u) =1f/a—1/A 
o% = variance of u 


0 
K_(s) = | e“dK(u) 


K,(s) = i e“dK(u) 
K(s) = R_(—s) + Kx(—s) 


Az = positive real root of i e*""dK(u) = 1 
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_ at ordered roots of characteristic equation, 
K(-—a) = A(s)B(-s) = 1(ai = ae) 


ii, = | xdK(x) = K’.(0) 
0 


K(0) = K_(0) 
ay = = 20/01 

p = A(a) 

q = B(A) 


W(x) = waiting time distribution 
Pp = true probability delay greater than zero 
u = true mean delay 
W(x) = 1— W(x) = complimentary waiting time distribution 
L[W°](x) = Lindley integral equation (complimentary) 


LLW°](x) = W(x) — 1+ K(x) - | W(x — y)dK(y) 


bt = | x”dL{ W°](x) 
0 
K; = K-(ai) 
Approximate waiting time distribution (single exponential): 
Wa(x) = 1—- Ce 
C = approximate probability of delay 


Wa = C/a = approximate mean delay. 


Approximate Ago: 
a= ar 
1-K(O 1— K_(0 
C= C(az) = _) = (0) 


1-Ki(az) 1- Raz) 
Approximation Ag: 
a=dr 
ey = Lilet) _ its _ K'.(0) 
"de K(0) — K-(az)_ K-(0) — K-(az) 
Approximation Apo): 
a = a, solution of 
og = LK) - K-@) IM — K(0)] _ [K-(0) ~ K-(@) If - K-)] 
u{1 — K-(a)] (0)[1 — K-(a)] 
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1— K(0 1— K_(0 
C=C = _1- KO) at). 
1- K—-(ao,1) t= K_(ao,1) 

Approximation A*: 


C = C* = Co(az) (from Azo) 





w* =e oe (from Azx,1) 
a QE 
‘ Co(az) 
det *= 
etermines a Clas) a 


Approximation A**: 


C = C** = Co, (from Ao) 








*K* C. 
w** = ote ate (from Ax) 
a 
C 
determines a** = C, — QE 


Approximation Azo: 


CS Or Sa 
Ou 
1—K(0) _ 1—K_(0) 


C = C = C a —_—_—_—_—"s—_— oo A . 
#0 = Colaiz) Li lan) T= RGa) 


Approximation Ay: 
a= aH 
Cui — Ci(ax) u+ +(0) 
WH) SS SS SS SS 


ayn Quy K(0)—K-(ay) RK-(0) — R_(an) 


Approximation Aur: 


a= aH 
= Cur 1 
QH Quy 
Approximation Ago: 
a= aH 
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Approximation Ag: 














a= aH 
ole) 4 #]1- Bf SE) 
= Ce Jn 2 20. 
We = =  .. 
Bxf({ = : =) 
V20, 


Approximation Axzr: 
Cir = 1— K(0) =1— K-(0) 
Cir its _ K4(0) 


wT = — = SF ESF 


apr K(0) K_(0)' 


Approximate waiting time distribution (multiple exponentials): 


Wa(x) =1— Yo Creo. 


i=l 
Approximation A&@),...,m: 
a; roots of K(—a) = 1 
ai = ai, t=1,-+--,m 


C; determined from 








ey Lipa *L(L—-1) +--+ (L—- k) = L} - 
i=1 E aj k=1 ai ai 


where the empty sum is taken to be zero. 
Approximation A9,:: 

a; roots of K(-s) = A(s)B(—s) =1 

ai =a=a5 

as = Re(az) 

C; determined from 

(1 — Ki)C, + (1 — Ke)C2 = U = 1 —- K(0) 


OE Ga OA aa: 


a, as 
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A General Class of Zero- or Minimum-Delay 
Fractional Rate Change Circuits 


By S. V. AHAMED 
(Manuscript received July 14, 1981) 


Rate changing occurs whenever sequences of data undergo trans- 
formations in rate without undergoing a change in the order of 
sequence. When the ratio of transformation is not an integer, frac- 
tional rate changes are necessary. These are generally, a prerequisite 
for the time-compression multiplexing mode of data transmission. 
Zero or minimal delay is a desirable characteristic, for example, in 
reducing the annoyance from the far-end echo whenever voice is 
encoded and transmitted. Conventional fractional rate changing 
entails an inherent delay in the rate change circuits. Segmenting 
shift registers reduces the delay of the last bit without completely 
eliminating it, unless the shift-register length is reduced to one bit. In 
this paper, a method of partitioning the shift registers by logarithmic 
counts is developed to reduce the complexity of the gating and the 
counting circuits. Zero last-bit delays are attainable in all cases 
where the rate increase is greater than two or, conversely, the rate 
reduction is less than half. For the remaining cases, the compromise 
between circuit complexity and the last-bit delay is outlined. 


1. INTRODUCTION 


Under the time-compression mode of data transmission, the round 
trip delay time is critically important in controlling the echo from the 
far end, as well as the direct transmission delay. The excess delay of 
the last bit, in changing the rate from the primary (or terminal) rate 
to the secondary [or time-compression multiplexing (TCM)] rate and 
vice versa, which can be a significant fraction of the overall delay, in 
any particular block is reduced to zero by the circuits presented in this 
paper. In general, it is shown that the last-bit delay can be reduced to 
zero whenever the required rate change is more than two, while 
increasing the number of shift registers (SRs) and the gating functions 
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logarithmically. Further, the gating signals are shown to be derived as 
combinations of logarithmic counts of the primary (terminal) clock or 
those of the secondary clock. The general principle of fractional rate 
changing is extended to fractional rate collating circuits. The compro- 
mise between zero last-bit delay and the complexity of the circuit for 
fractional rate changing between half and one is delineated. 

In 1971, a general class of rate change circuits was presented’ for use 
in the magnetic domain technology. The constraint on the design of 
such circuits was that all the individual bits of information (domains) 
be propagated by one period (the physical spacing between one pattern 
to the next in domain technology) in one clock cycle of the rotating 
magnetic field. To conform to this design requirement in domain 
technology, the number of patterns in different paths of the rate 
change circuit were arranged to follow a geometric progression. The 
topological arrangement of sRs and gates, if operated in conventional 
semiconductor technology, will perform satisfactorily. However, it 
imposes two unnecessary restrictions: (z) this version in semiconductor 
technology ignores the capability of srs to shift-in (s1) at one rate and 
shift-out (so) at another rate, and (ii) fractional rate change is a two- 
step procedure, thus, demanding a large number of sR locations. The 
essential feature retained in these circuits is that the delay between 
the reception of the last bit of block of data and its transmission would 
be zero. Zero last-bit delay fractional rate change is not possible with 
a conventional arrangement” of sRs, unless the number of independent 
registers is increased to the number of bits in the block, thus, demand- 
ing extremely complex arrangements of gating and shifting functions. 

In this paper, we present circuits that retain the zero last-bit delay 
characteristics and the simplicity of the gating function without lin- 
early increasing the number of srs, or the complexity of gating and 
shifting functions. Even though most of the emphasis is placed on 
fractional rate changing, integral rate changing is equally easily accom- 
plished by the circuit arrangements presented here. 


Il. ZERO-DELAY FRACTIONAL RATE INCREASING 

For the cases presented in this section, let the proportional rate 
increase be greater than two. Cases where the rate change is between 
one and two are discussed in Section IV. 
2.1 Fractional rate between two and three 


Consider a block 42 bits long where the rate change ratio is 3:7. 
Forty-two, being a multiple of 3 and 7, generates a situation where one 


* Conventional arrangement consists of having one or more SRs in which data are 
sequentially shifted in at the incoming rate and shifted out at the output rate. 
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block of data 42 bits long experiences the same set of gating functions* 
that was performed for the last 42-bit data block or the next 42-bit 
data block. Next, consider an arrangement of 7 srs shown in Fig. 1. 
Data are received uniformly one bit ever ¢ seconds. Data are generated 
in a burst of 42 bits. Burst repeats every 42¢ seconds and lasts for 18¢ 
seconds. Effective rate change is 3:7 during the burst. Here, the srs 
follow a sequence 2°, 2', 2”, 23, 2*, (42 — 2°). The signals C and C’ are 
clock signals at ¢ and t’, where ¢ denotes the primary (terminal) clock 
and ¢’ denotes the secondary clock durations in seconds. The signals 
dg, a5, —, @1, ao are generated every 42¢ seconds and last for (42 — 2°), 
2‘, —, —, 2°, 1 t seconds, etc. Similarly, bs through bo are generated by 
identical circuits operated at the secondary clock t’ but delayed for 
the 24t of the 42¢ seconds cycle time. The operation of the SRs is 
summarized in Table I. 

It can be seen that all srs have a positive duration between the 
finish of the sI and start of so (see column 6, Table I). As proved in 
Appendix A, the Ath sr experiences a delay of: 


dy, = 2? t — 2*t’ seconds. . (1) 


If ¢ is defined as >2t’, d; is always positive. In the example presented 
here, the delays for sks 1 to 5 can be equated to d; in (1) by substituting 
t’ = %t both in (1) and in the sixth column of Table I. The delay of the 
42nd bit is zero since the ending instants of received and the transmit- 
ted bits coincide at the end of the block. 

The four distinct characteristics of the circuit configuration pre- 
sented in this section can, thus, be summarized as follows: 

(1) The si-so duration is always positive for all srs. 

(ti) The gating signals ag through ap and b¢ through bo are generated 
simply by identical binary (in cases where the fractional rate change 
ratio r > 2 and < 3) counters each being driven by a primary (terminal) 
clock and by a secondary clock, respectively. 

(iti) The delay in the circuit as measured by the difference between 
the end of arrival of the last data bit and the end of transmission of 
that same data bit at the higher clock is zero.! 

(tv) The number of sR locations required is always the bit length of 
the block reduced by one. 


2.2 Fractional rate increasing between three and four 


In Section 2.1, the number of locations in the sR progresses 
as 2°, 2) ..., 2' (N — 2'*'), depending on the size of the block (N). 


* Systems considerations usually require such a repetitive set of functions for contin- 
uous operations. 

t It is possible to advance the operation by an extra (¢ — t’) second by advancing t’ by 
this period and loading the last bit into an sr. But, we have ignored this situation as it 
needs only a trivial modification of the circuits shown here. 
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bo 


DATA 





OR GATE 


SR6 = 10 BITS 


Fig. 1—Shift register arrangement for a 42-bit data block. 
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Table |—Details of 42-bit block—3:7 rate increasing circuit 




















Se Start of Finish Startof End of oe On Time 
si(a) of st (6) so (c) so (d) (ie By 7 a3 
SR6 0 10¢ A=24t A+ 10t’ 14t 10¢ 10t’ 
SR5 10¢ 26¢ A+10t’ A+ 26t’ 10t’ — 2¢ 16¢ 16¢’ 
SR4 26t 34t A+ 26t’ A+ 34t’ 26t’ — 10t 8t 8t’ 
SR3 34t 38t A+ 34t’ A+ 38t’ 34t’ — 14t At At’ 
SR2 38t 40t A+ 38t’ A+ 40t’ 38t’ — 16t 2t 2t’ 
SR1 40t 4lt A+40t? A+4l1t’ 40t’ —17t t t’ 
Signal Start Finish Signal © Start —-—*Finish Bee Bs 
ao 4lt 42t bo A+41t? A+ 42t’ Zero 


* This interval can be always verified as being positive at t = 7/3 X t’. 

Notes: 

1. Shift-in takes place during one clock cycle at ¢. 

2. Shift-out takes place during one clock cycle at t’. For maintaining zero delay, it is 
assumed that the end of the 42nd clock cycle at ¢ coincides with the end of the 98th 
cycle at t’. This condition ascertains that ao and bo end simultaneously at the end of 42¢ 
or 98’. 


When the fractional rate change ratio r is between 3 and 4, all the 
essential characteristics of this class of rate change circuit may be 
maintained by changing the numbered locations in the srs to follow 
the progression 


3°, 31, 37, «++, 3, (w- 1- > ) 
r=0 
Consider a rate change ratio of 7:23 with a 161-bit block. The sr 
progression becomes a 1-, 3-, 9-, 27-, 81-, 39-bit position constituting 
160 locations. The six SR circuit is shown in Fig. 2, and the shifting 
times are tabulated in Table II. 
The generalization of this arrangement for an N-bit block with any 
fractional rate change r is presented in Appendix B. 


lll, FRACTIONAL RATE DECREASING CIRCUITS 


When the decrease in rate equals or becomes less than 0.5, we have 
a situation where the circuit arrangement presented in Section II can 
be used to decrease the rate. Two changes however, are necessary: (Z) 
the direct transmission of data between ap and 0p (Figs. 1 or 2 have to 
be replaced by a flip-flop to load and hold the first data bit and, (iz) 
the signals a, ---, a, and bo, ---, b, must start at the beginning of 
the block instead of ending at the end of the block. 

Consider a data block 680 bits long where the desired rate change 
ratio is 0.425. The minimum size of the data block is 680, being the 
least common multiple of 17 and 40, even though equally satisfactory 
arrangements can be conceived for all blocks that are multiples of 680 
bits long. Once again the sr sizes (Fig. 3) can be written down as 1, 2°, 
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bo 


40 


DATA 
IN 





OR GATE 


Fig. 2—Shift register arrangement for a 161-bit block. Rate change ratio is 7:23. 
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Table ll—Details of 161-bit block—7:23 rate increasing circuit 





= Start of | Finish Start of | Finish of ee On Time 
si (a) of st (6) So (c) so (d) Ge bye 31 36 
SR6 0 39t A=112t A+ 39t’ 73t 39t 39t’ 
SR5 39t 120¢ A + 39t’ A + 120t’ 39t’ — 8t 8lt 81t’ 
sR4 120¢ 147t A+ 120t’ A+ 147t’ 120t’ — 35t 27t 27t’ 
SR3 147¢t 156¢ A+147t’ A+156t’ 1470’ — 44t 9t 9t’ 
SR2 156¢ 159t A+156t’ A+159t’ 156t’ — 47¢ 3t 3t’ 
SRl 159t 160¢ A + 159’ A+ 160t’ 159t’ — 48¢ t t’ 
Signal Start Finish Signal Start Finish Finish bo 
Finish ao 
ao 160¢ 161¢ bo A 160¢ A+ 161t’ Zero 


* This number can be verified as always being positive when ¢ = 23/7 x t’. 


Q?, 23, 24, 2°, 2° 27, 2°, (680 — 2°). The clocks C and C’ are in the 
proportion of 17:40. The signals ao, ai +--+ ag, dg and bo, bi «++ bs, bo 
are generated for (1, 2°, 2’, ---, 2°, 168)¢ and (1, 2°, 2’, ---, 2°, 168)¢’ 
seconds,* starting at the beginning of the block. The starting instants 
of these signals are (0, 1, 2°, 2’, --- , 2°)¢ seconds and (0, 1, 2°, 2’, ---, 
2°)t’ seconds from the starting position of the block for ao, a1, ---, ag 
and bo, bi, --- , bs, respectively. The circuit arrangement is presented 
in Fig. 3, and the gating times are tabulated in Table III. The principle 
may be extended to an N-bit block whose effective rate reduction is 7, 
and a circuit configuration similar to that shown in Fig. 6 may be 
derived. / 


IV. FRACTIONAL RATE CHANGES BETWEEN ONE AND TWO 


A zero-delay circuit configuration for these rate changes may also 
be derived by extending the general principles of Sections II and II. 
However, the number of srs increases (1’ = 1) to the number of bits in 
the block, and the circuit configuration becomes trivial. On the other 
hand, it is possible to compromise the zero delay requirement slightly 
and obtain the advantages of reducing the number of sks and simpli- 
fying the gating functions. These considerations are discussed next. 


4.1 Rate increasing ratios between one and two 


Consider a data block 84 bits long. If the desired rate increase is 4:7, 
one can arrange a circuit shown in Fig. 4a that is consistent with the 
arrangement shown in Fig. 1. However, if SR7 is emptied after a delay 
of 36t* seconds, then sR6 would not have completely shifted in before 
it becomes necessary to shift it out, and it is here that the compromise 
becomes necessary. If the shifting out sequence is now delayed by a 


* Durations ¢ and ¢t’ are cycle times at faster (TcM) and slower (terminal) clocks, 
respectively. 
t The duration ¢ is a clock cycle time at the terminal (primary) rate and 36 = 84(1-44). 
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SR1 = 2° BIT 


OR GATE 


SR = 168 BITS 


Fig. 3—Rate-reducing circuit for a 680-bit data block. Rate change ratio is 17:40. 
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Table !I—Details of 680-bit block—40:17 rate decreasing circuit 


Sn Start of Finish of Start of a SI-SO On Time 

SI (a) si (b) so (c) (d) Interval ai aa 
SRo 0 0* 0* t’ 0 0* t’ 
SR, t 2t t’ 2t’ t'-t t t’ 
SRo 2xt 4xt 2t’ 4t’ 2(t’ — t) 2t 2t’ 
SRi 2" x ¢ eee f oy 2't’ 2'"(t’ — t) oF vam 
SRg 2’xt OF sé if i a 27(t’ — t) 2"t 2"t’ 
SRo 2x t 680 yg 680t’ 2°(t’ — t) 168¢ 168t’ 


* Indicates shift-in time of SRO. This is <¢ or t’ and can be used effectively to mark 
the beginning and end of a 680-bit data block. 


minimal amount to just accommodate the binding requirement on this 

sR, then it can be seen that the delay should be 4.571¢* seconds. 
Further, this minimal delay depends on the length of the block, the 
fractional rate change ratio r, and the sR position in the hierarchy of 
registers. The generalized calculations for the delays are presented in 
Appendix C. 

For the particular case presented in Fig. 4 for the 84-bit block, it is 
now possible to compute the gating times and the delays for the overall 
functioning of the circuit. These results are tabulated in Table IV, and 
Fig. 4b depicts the timings. The minimum delay of 4.571¢ seconds 
plays a central role in the operation of the circuit. This delay being 
essential for the functioning of SR6 appropriately, gets fragmented into 
2.286¢, 1.1432, 0.5712, 0.2862, 0.143¢, 0.071¢ and, finally, 0.071¢ for sr6 
through SRO to add up to the net delay of 4.571¢ seconds for the circuit. 
Once this delay is generated by an additional circuitry, the functioning 
of the gates at the higher clock rate can be normal binary counters, 
delayed by this appropriate amount. This feature eliminates the need 
for complicated gating circuits, even when the fractional rate increasing 
is less than two. 


4.2 Compromise between circuit complexity and minimum delay 


It can be seen that the entire delay of 4.571¢ seconds can be cut into 
half by partitioning sR6 into two sixteen-bit registers. In this case, SR7 
is delayed by 2.286¢ and the si finish coincides with the so start for 
sR5, and the delay reappears as the summation 1.143¢, 0.571¢, 0.2862, 
0.14382, 0.071¢, and 0.071¢ for the remaining srs at the end of the block. 
If one extends this reasoning indefinitely, then all the sRs become 
partitioned to single-bit registers for zero delay. It is at this stage that 
the compromise between sR complexity and minimum delay becomes 
obvious. All the same, these circuits outperform conventional circuit 


* 4.571 = (20 + 32) — (36 + 20 x 4%). 


RATE CHANGE CIRCUITS 335 


DATA 


SR3 = 4 BITS 


-------— 
Ls 





OR GATE 


SR7 = 20 BITS 


Fig. 4a—Rate-increasing circuit for an 84-bit block. Rate change ratio is 4:7. 
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Table !V—Details of 84-bit block—4:7 rate increasing circuit 





Baad SI-SO On Time 
Start Finish Start of Finish of Interval SSS 

SR ofsi of sI so (c) so (d) me 

(a) (b) € eas sI so 
SR7 0 20t A = 40.571t A + 20t’ 20.571¢ 20t 20t’ 
SR6 20t 52t A + 20t’ A + 52t’ 0 32t 32t’ 
SRD5 52t 68t A+ 52t’ A + 68t’ 2.285t 16t 16t’ 
SR4 68t 76t A + 68t’ A+ 76t’ 3.429¢ 8t 8t’ 
SR3 76t 80t A + 76t’ A + 80t’ 4,000t 4t 4t’ 
SR2 80t 82t A + 80?’ A + 82t’ 4.286t 2t 2t’ 
SR1 82t 83t A + 82t’ A + 83t’ 4.429t t t’ 
SRO 83t 84t A + 83t’ A + 84t’ 4.571t* t t’ 


* Minimum delay = 4.571¢ and A = (36 + 4.571)t. In the case of sRO, (c = d — b) to 
prove that the end of the block at the faster rate ends as late as the minimum delay 
imposed upon shifting the first data bit out of sR7. (See Fig. 4a.) 


—20t SECONDS - ——-—— 32t SECONDS ———-— 16t SECONDS etc. 


0 20t 52t 68t 76t 80r 84t 
—START EMPTYING SR7 


y| MINIMUM DELAY = 4.571 t= D 
SR7 


: 36 A=40571¢ START EMPTYING SRE 
FINISH LOADING SR6 ~ FINISH 
SR6 7 ~ EMPTYING SR6 


FINISH LOADING SRO 





0 START EMPTYING SRO-— 
4 

FINISH EMPTYING SRO—~ 
ee 


(b) D=4571t 


Fig. 4b5-—Timing diagram for an 84-bit data block in a 4:7 rate-increasing circuit 
configuration. 


arrangements because the delay time of 7 sRs, each 12 bits long, would 
be a 12¢ second compared to 4.571¢ for the arrangement suggested here 
with 8 srs, or of 2.286¢ with 9 srs. The extent of gain depends on the 
length of the block and the rate change ratio. The circuit complexity 
has to be chosen by a designer depending on the particular require- 
ments. 


V. FRACTIONAL RATE COLLATING CIRCUITS 


To illustrate the example of fractional rate collating, consider 17 
independent data lines each carrying independent data uniformly at a 
primary rate of one bit every ¢ seconds. The 18th line also carries 
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binary data but a tertiary rate of one bit every ¢, seconds. Let the block 
time be T seconds, corresponding to 7 bits at the primary rate ¢, and 
4 bits at the tertiary rate ¢, (1.e., 7t = 4t,). The collating circuit will be 
able to present the first bit of the first channel followed by the first bit 
of the second channel etc., until the 17th channel, then the second bit 
of the first channel, followed by the second bit of the second channel, 
etc., and so on, until the 7th bit of the 17th channel. The final section 
of the output data block would be the four bits of the 18th channel. 
The output data rate will be at a uniform secondary bit rate of t’ (i.e., 
T/123). 

A collating circuit to perform this function is presented in Fig. 5a. 
Data arrive uniformly every ¢ seconds on 17 channels, C,, ---, Ciz, 
and ¢, seconds on channel Cig with 7t = 4¢t, data are transformed 
to 123-bit blocks. The first bit of C, is collated with first bit of C2, 
etc., until Cy; the second bit of C, is collated with the second bit of 
Cz, +++, etc., and the last four bits of the block are bits received on 
Cig. The output rate is t’ with 123¢’ = 7t = 4t, = T. The first bits of all 
the 17 channels are collated into sR1, the second bits of all the 17 
channels are collated into sR2, etc. The 17 blocks of si signals m, --- , 
M173 Mg, +++, Ms, +++, etC., +++, Mig are generated by ANDing 
signals ¢, 2¢, 3t,---, 7t, with t’—the signal ¢ being long enough to 
accommodate 17 full cycles of t’. The so signals m, «++, M173 Mis, -*+, 
Ng} +++, etC., +++, Mie3, are generated by delaying the secondary block ¢’ 
by 21 (i.e., 17 + 4) full cycles of the secondary clock t’. The sI signals 
are in two classes: m1, «++, Mig are generated as indicated earlier, 
and mi20, «++ , M23 are generated at the tertiary rate of one signal ev- 
ery t; seconds. The signal timings are shown in Fig. 5a. Shift-in sig- 
nals m, +++, Mig are generated by ANDing ¢ with 17 of t’. Signals 
M20, +++, Mi23 are generated by ANDing ¢ with one of t’. Shift-out 
signals m1, -++, Mi23 are generated at ¢’ after delaying clock by 21¢t’ 
from the start of the block. It is interesting to note that Srl has a 
delay time of 4t’ seconds between si and so. However, this time 
gradually diminishes by “t’ as one progresses through SR2 to SR7, with 
Sk7 itself shifting out as soon as it is shifted in. (See Fig. 5b.) It is this 
limit that specifies the minimum delay of the collating circuits. 

Other collating circuits can be generated by the principles indicated, 
and the generalization of such circuits is also possible but omitted 
here. 


VI. DISCUSSION OF RATE CHANGING AND COLLATING CIRCUITS 


Conventional sr arrangements lead to a delay in the transmission of 
data, unless the size of the register diminishes to one. For circuits 
described here, there is a geometric progression in the sizes of the 
registers and in the logarithmic number of registers. The base of the 
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Fig. 5a—Fractional rate collating circuit for a 123-bit data block. 
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Fig. 5b;-—Timing diagram for circuit shown in Fig. 5a. 


geometric series is chosen to be an integer just under the fractional 
rate change ratio. The time for shifting out any particular register in 
the series is made to overlap with the time for shifting in of the next 
lower-sized register of the series. Two definite advantages arise by this 
arrangement of SRs: (4) the number of registers can be reduced expo- 
nentially (see Appendices A through C) for a given delay, and (ii) the 
gating can be performed very simply by counters that block out 
exponentially varying sizes of time at the primary or secondary clocks. 
These features are absent in the conventional arrangement of registers. 

When the fractional rate change falls between one and two, the 
arrangement suggested still outperforms conventional arrangement as 
far as delay is concerned for a given number of srs. In a limit, the two 
systems converge to a minimum delay with one-bit registers. The 
compromise between circuit complexity and minimum delay is offered 
to the designer. 

Collating circuits also retain some features of minimizing the delay. 
Consider the circuit presented in Section V. Here, conventional ar- 
rangements would have two extremes of implementation. First, con- 
sider 17-seven and 1-four bit srs, all shifting in and then shifting out. 
The delay is obviously 123¢’ since all the registers have to be shifted 
in before shifting out. Second, consider 123 one-bit registers. With an 
immense amount of complexity in gating and clocking, one-bit delay 
may be achieved. The first arrangement does not compare favorably, 
from the delay consideration, with the configuration of the 18 regis- 
ters—the delay being 123¢’ versus 21t’. The second arrangement offers 
an advantage in delay time (¢’ versus 21t’) but at an enormous 
complexity (123 timing circuits and sks versus 18 such circuits). As 
indicated in Section 4.2, a compromise between circuit complexity and 
minimum delay also exists for collating circuits. 


Vil. CONCLUSIONS 


Fractional rate changing may be achieved with zero last-bit delay 
without increasing the number of sRs to the number in the bits in the 
block. In the circuit arrangement presented, the number of sRs in- 
creases logarithmically. The base for the logarithmic increase is the 
highest integer just below the rate change ratio. When the rate change 
ratio is between one and two, the simplicity of the sR arrangement and 
gating may be retained by accepting a slight last-bit delay in the 
circuit. Such an arrangement, even though not reducing the last-bit 
delay to zero, still outperforms a conventional arrangement by a large 
margin. The exact compromise between complexity and delay is left to 
the circuit designer. 

Fractional rate collating of data blocks have some of the advantages 
so far as the simplicity of SR arrangement and gating is concerned. 
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APPENDIX A 
Shift Register Delays 


Consider an N (>2/7' and <2’) bit block in which a rate change of 
r (22 and <3) is being accomplished. The arrangement of a circuit 
with 7 sRs consists of 2°, 2), .--, 27-?, (N — 2/"') bits each. The size of 
the kth (k <j — 1) sR being 2” finishes at an instant of: 
i=k-2 
d; = (v == » a seconds.* (2) 
1=0 
Similarly, the start of the so of the kth sr does not take place until an 
instant 
i=k-1 ; 
d= N(t—-t’') + (w asl =-5) xe seconds. (3) 
i=0 
The duration between the instant at which So starts and the instant at 
which sI ends can be obtained by subtracting (2) from (3) leading to 


d, = 2" "t — 2*t’ seconds, (4) 
since 
i=k-1 
Si y, 22%: (5) 
i=0 


For this Appendix, by definition’ we have t = 2t’ and d; always remains 
positive. 


APPENDIX B 
Generalized Fractional Rate Increasing Circuit 


Let the rate of an N-bit block be increased by a fractional rate of r. 
The general configuration of the circuit is shown in Fig. 6. The number 
J of srs is chosen to satisfy the condition that: 


gab. J, 
Yri<N-1s3¥ ri, (6) 
0 0 
where r; is an integer just less than r. The first 


N-1-¥ rj" 


* In eq. (2), ¢ corresponds to the duration of a clock cycle at the terminal rate and t’ 
corresponds to the duration at the burst rate. The origin of time is chosen to coincide 
with the start of the block. 


t The fractional change in rate is =2. 


342 THE BELL SYSTEM TECHNICAL JOURNAL, MARCH 1982 


7) 





OR GATE 


SRi+1=N-1-D// 


7/=0 


Fig. 6—Generalized fractional rate change circuit. 
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bits of the block are channeled into sR;+1 and the rest r4 bits into sR,, 
etc. Now consider the kth sr, ri’ bits long. The ending instant of s1 
for this register takes place at: 
i=h—-2 
d; = (~ -1- ¥ ri) seconds, (7) 
i=0 
where ¢ denotes the duration at the primary clock and the start of the 
so time takes place at an instant: 
i=k-1 
d. = N(t — t’) + (w -l1 ¥ rye seconds. (8) 
i=0 
The duration between the instant at which the so starts and the 
instant at which the sI ends is given by 
i=k—2 ; i=k-1 ; 
dy=t—t'+ Y rit— ¥ rit’ 


i=0 i=0 


k-2 ; i=k-1 ; 
= (: +y¥ rye (: + ¥ rs)e (9) 
i=0 i=0 


In the limiting case where ¢ = rt’, we have: 
do = rit’ — 2t’. (10) 


The value of d,; becomes zero when r; = 2 and eq. (10) becomes the 
same as eq. (4) in its limiting case. However, d,, always remains equal 
to or greater than zero, thus, validating the basic law of all srs—that 
they may not start to so, unless they are completely shifted in. 


APPENDIX C 
Delay Requirement on Fractional Rate Increasing Between One and Two 


Consider an N-bit block with a rate change ratio of r (<2 and >1). 
Let r’ be the reciprocal of r. Then it can be seen that an arrangement 
of srs shown in Fig. 7, and zero delay leads to a situation where the 
jth register would not have been shifted in before it is ready to be 
shifted out. To avoid this and to assure that so takes place immediately 
after shifting in, the delay necessary can be calculated as 


din = [(N — 2"**) + 2 —- N(1 —r’) — (N — 2**)r’]t seconds 
= 2*(2r’ — 1)t seconds. (11) 


The first two terms in (11) denotes the time required to fill the last 
two sRs with (N — 2**') and 2" register positions at the primary rate of 
t seconds per clock cycle. The third term denotes the length of the 
block at a difference of clock rates between the primary clock rate of 
t second and a secondary clock of t’ (= r’t) seconds. The fourth term 
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Fig. 7—Generalized fractional rate change circuit for ratios between 1 and 2. 
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represents the time to empty the last register at the secondary rate. It 
can be seen that in the limiting case when r’ tends to 0.5 the delay 
necessarily tends to become zero. 

The physical significance of this delay is to adjust the coincidence of 
the finish of the sI instant with the start of the so of the last but one 
SR, thus, finishing the binary sequences of register locations. It is also 
interesting to note that the time delayed in emptying the first sr 
reappears at the end of the block as a summation of delays in the 
various other registers before the last register. For instance, the delay 
between shifting in and shifting out of the (7 + 1)st; (0<i<k) sR can 
be written as: 


j=k : J=k-1 ; 
di) = [ Se 2 seconds. (12) 
i= jri-1 
The delay for SR1 becomes 
= [r’ x 2(2" — 1) — (2" — 1)]ét seconds. (13) 


The delay for sRO causes an incremental increase in delay of (27’ — 1)t 
seconds. Hence, the total delay of the last bit coming out of SRO is: 


= (r’ x 2"*1 — 2*)t seconds, (14) 
which is consistent with eq. (11). 
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Peak Signal-to-Noise Ratio Formulas for 
Multistage Delta Modulation With RC-Shaped 
Gaussian Input Signals 
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A multistage delta modulation (MSDM) encoder contains a number 
of delta modulation (DM) stages, where each delta modulator encodes 
the band-limited error of the previous delta modulator. The DM 
binary outputs are then multiplexed for transmission. By this tech- 
nique, substantial gains in s/n compared to a single-stage DM can be 
achieved at high transmitted bit rate to message bandwidth ratios 
(f>/f-). For Gaussian input signals having band-limited resistance- 
capacitance (RC) spectra, the peak s/n performance of MSDM as a 
function of (f,/f-) and the number of DM stages is presented. It is 
shown that like PCM, MSDM exchanges s/n with (f,/f-) on an exponen- 
tial basis. 


1. INTRODUCTION 


Delta modulation (pM) has been extensively studied,’® and following 
its integration onto a chip,” it is being increasingly used in industrial 
applications. The salient advantages of DM are robustness to transmis- 
sion errors; tolerance to clock jitter; simple filtering requirements; 
suitability for encryption; and low complexity resulting in inexpensive 
implementation. In typical applications, the ratio f,/f- is <10, where f, 
is the transmitted bit rate and f, is the bandwidth of the message 
signal. However, DM does not efficiently improve its s/n with increasing 
f>/fe, particularly when compared to pulse code modulation (pcm). As 
a consequence, DM is rarely used to encode high-quality audio signals 
because of the excessive f,/f- ratios required. 

In pM, the quantization noise is dependent on the error e(¢) between 
the input signal x(¢) and a locally reconstructed version y(t) (formed 
by locally decoding the transmitted bit stream). The y(¢) signal essen- 
tially tracks x(t), and the polarity of the transmitted bit is identical to 
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the polarity of the tracking error e(f) at any sampling instant. To 
increase s/n by increasing f,, requires e(t) to decrease. It might be 
supposed that the reduction in e(t) would be enhanced if adaptive 
delta modulation’ (ADM) is used rather than linear pM.’ In apM, the 
changes in y(¢) per clock period, i.e., the step sizes, are not constant as 
in linear DM. Various step-size algorithms have been used in. ADM, but 
all occasionally produce inappropriate step-sizes resulting in “over- 
shoot-noise.”?° In fact, ADM is generally not used to increase peak s/n 
but, rather, to greatly extend the dynamic range of nonadaptive DM. 
However, with care the peak s/n can be enhanced at high clock rates, 
but not by significant amounts.” 

The basic problem with any form of DM is that the encoder generates 
information which is only dependent on the polarity of the error. No 
description of the magnitude of the error is available at the receiver. 
Das and Chatterjee’* made a proposal to overcome this defect by 
conceiving an encoder composed of many DM stages, each encoding 
the band-limited error signal of the preceding stage. In this way, a 
more accurate description of the tracking error is available at the 
receiver, and the exchange of s/n with transmitted bit-rate is greatly 
enhanced. This method of modulation is multistage delta modulation 
(MSDM). 

Initially it was claimed” that mspM had a better coding efficiency 
than conventional pcm for the same information rate, but subsequent 
work”’ using computer simulation up to 3-pM stages showed that 
although MspM is better than DM, it does not perform as well as PCM 
operating with “4o loading.” A theory of MSDM was presented by 
Franks, Schachter, and Shilling,’ together with computer simulation 
of a two-stage MSDM. Chakravarthy and Faruqui’” constructed a two- 
stage MSDM using adaptive DM, and refined the expressions for s/n 
previously propounded.” 

In spite of these endeavors, there appeared to be a need for more 
precise formulations of the peak s/n of MSDM. These expressions were 
found to involve the summation of the peak s/n of numerous DM 
stages; therefore, it became necessary to develop accurate, yet simple, 
expressions of s/n for a DM encoder. This was done, and the findings 
published separately® as part of the theory on linear pM. We now use 
these results for their intended purpose, namely, to provide simple © 
equations for the peak s/n of MSDM when encoding Gaussian input 
signals with band-limited Rc spectra. In the pursuit of this goal, we 
hope to provide new insight into the behavior of MsDM. Later in 
Section IV, we compare the s/n performance of MSDM with DM, PCM, 
and differential pulse code modulation (DPcM). It should be noted that 
we do not play advocate for MSDM, but merely endeavor to place its 
s/n performance in perspective. No attempt is made to judge its 
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relative complexity. We commence with a description of the principle 
of MSDM. 


ll. PRINCIPLE OF MSDM 


The simplest form of MSDM uses two delta modulators in the encoder 
as shown in Fig. 1. The analog input signal is band-limited by filter F 
to give the MSDM input signal x(t). The first delta modulator pm, 
encodes x(t) producing binary information representing the polarity of 
the tracking error e,(¢). In delta modulation, as distinct from Msp», 
the error in the recovered signal is the filtered e,(t) signal, namely 
é,(t). Two-stage MSDM reduces this error by encoding é)(¢) by a second 
delta modulator DM: and, thereby, making it available at the receiver. 
By this means, the overall error is reduced to the in-band error é2(t) of 
DM2. Observe that in our nomenclature a tilde (~) above a symbol 
means that it has been low-pass filtered by a filter having a linear 
phase-frequency characteristic. 

Thus, in the two-stage MSDM, the signals x(t) and é\(t) are encoded 
by DM, and DM; to yield binary signals L,(t) and Lo(t). Typically, both 
delta modulators will be clocked at the same rate f,, resulting in a 
transmitted bit-rate of f, = 2f,. At the receiver, the Zi(t) and L(t) 
signals are demultiplexed and decoded to give yi(t) and y2(t), respec- 
tively. A delay D is introduced in the first channel to compensate for 
the delay resulting from the band-limiting of e:(¢) by filter F at the 
input to DM2. The delayed decoded signal z2;(¢) is added to y2(t), and 
the noise residing outside the highest frequency f, in x(t) is removed 
by the final filter F, to yield a recovered signal m(t) that is a close 
approximation to x(t). We assume that the binary signals L,(¢) and 
L(t) are generated without error. 

The scheme is extendable to N delta modulators, each encoding the 
band-limited error signal of the preceding modulator and operating at 
a clock rate fi:, 1 = 1, 2, ---, N. Figure 2 shows an N-stage MSDM 
system. Signals x(t) and é,(t), R = 1, 2, ---, N — 1, are encoded by 
delta modulators DM;, k = 1, 2, ---, N, into binary signals L,(t), k = 
1, 2, ---, N. The binary signals are multiplexed, and transmitted at a 
bit-rate f, whose value is the sum of the DM sampling frequencies. 
After demultiplexing, each of the L,(t) signals are decoded into y,(¢), 
k = 1, 2, ---, N and delayed by the networks designated D,, k = 
1, 2, ---, N—1, in Fig. 2, with the exception of y(t). We will assume 
that filters F\, Fr, and F, are identical linear filters that impose a 
signal delay of ¢, seconds, whereas the delays associated with the 
networks D, are integer multiples of ¢,, being (N — 1)t, for the first 
channel (k= 1), and reducing by ¢, for subsequent channels. The 
locally decoded signals in each channel, suitably compensated by the 
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Fig. 1—Two-stage MSDM system. 
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Fig. 2—N-stage MSDM system. 


delays D, for the delays caused by filters F; in the encoder, are 
combined to give 
N-1 


c(t) = yn(é) + 2 zr(t), (1) 
where z;,(t) is the signal at the output of Dz. Now the locally decoded 
signals at the outputs of the integrators are 

yilt) = x(t) — er(t) 
and 
y(t) = Ex-i(t) — ex(t), k=2,3,---,N-1 
and after delay compensation they become 
zi(t) = x(t — (N — 1)t.) — eit — (N — 1)to) (2) 
and 
za(t) = p(t — (N — k)t.) — ex(t — (N — R)to), 
k=2,3,---,N-—1, (3) 
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respectively. Substituting z(t) and z;(t) from eqs. (2) and (8) into eq. 
(1) yields 


c(t) = x[t — (N — 1)t.] — en(t) 
N-1 


= 2 {ex[t — (N — k)t.] — &.[t —(N—k—-1]t.)}, (4) 


and upon filtering c(t) by filter F, to remove out-of-band noise, the 
recovered signal is obtained, 


m(t) = x(t — Nt.) — En(t). (5) 


This signal m(t) is composed of the input signal x(t) delayed by Nt, 
and a noise component that is the filtered error signal of the Nth stage 
DM. Observe that the error signals in each DM stage, with the exception 
of the last stage, cancel out because of the choice of delays D;, in the 
MSDM decoder channels. 
Therefore, the s/n of MSDM is 
Axe) 
s/n = 6 

(ER(E)) ’ 6) 
where ((-)) means time averaging of (-). Alternatively, eq. (6) can be 
expressed as 


_ (x*(t)) (é(t)) (én-1(¢)) 


the product of the s/n’s of each DM stage. The s/n in dBs is 

s/N = x S/Ni, (8) 
where 

S/Ni = 10 logyos/ni, (9) 


The upper case S/N in eq. (9) is in dBs, and the lower case s/n is a 
ratio. Thus, the S/N of an MSDM system is the sum of the S/Nn’s of each 
DM stage. The next problem is to determine these S/N’s in terms of 
DM parameters. 


lll. PEAK S/N OF MSDM 


We will assume that each DM stage has a step-size which produces 
peak s/n for that stage. For a Gaussian input signal x(¢) band-limited 
to frequency f., the peak s/n for the first DM stage in the MSDM encoder 
may be expressed as* 
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3 
s/nrc = ee (5) (4) , 001<SR<05 (10) 





opt, RC B ie 
and 
x 0177 (AN 
S/F = 5 (4) > (11) 
opt, F fe 


where the subscripts Rc and F refer to RC and flat Gaussian input 
signals, respectively. The flat Gaussian signal is band-limited white 
noise occupying the frequency band +f, and by Rc filtering this signal, 
the Rc Gaussian signal is obtained. The sampling frequency in eqs. 
(10) and (11) is fii; the break frequency of the Rc Gaussian input signal 
is fi; and £ and p are given by 


fi 
east (12 
B i. ) 
and 
_ tan™*(1/B) 
T= Btan "(1/8)" “ 
The optimum slope loading factors are 
f. 0.72 
Copurc = 1.3] toe) (14) 
and 
Copt,F = 0.5 + 0.722 we.(F). (15) 


Consequently, if the first delta modulator stage DM, is encoding an RC 
Gaussian input signal the peak s/n can be expressed from eq. (10) as 


s/nrc = Orc fei, (16) 


where @gc is a DM parameter whose value is evident from eq. (10), and 
fs is the clock frequency for DM,. Observe that @rc has a relatively 
weak dependence on f.; [see eq. (14)] and this dependency will be 
considered later in the calculation of the peak s/n of MSDM. 
Irrespective of the spectrum of x(t), the signals applied to subse- 
quent DM stages will be assumed to be flat Gaussian signals. This is a 
reasonable assumption as each DM, excluding the first, encodes the 
band-limited error signal from the preceding stage. To substantiate 
this assumption, we make the following points. Granular noise is 
known® to dominate slope overload noise when a delta modulator 
operates at its maximum s/n. Computer simulation results for granular 
noise when the s/n is close to its peak value have shown’ that the 
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noise spectrum is flat over the message bandwidth. Models have also 
been proposed” for pm that have flat noise spectra when the encoder 
is optimally loaded. Let us now consider the probability density 
function (PDF) of the filtered error signal €,(¢). Although the unfiltered 
DM error signal e(¢) is uniformly distributed over the range of twice 
the DM step-size,® the act of filtering e(¢) by a filter F, of bandwidth f, 
is to produce a signal é(¢) whose PDF becomes increasingly Gaussian as 
the ratio of f:/f- is increased. For the values of f,:/f. considered here, 
the Gaussian PDF assumption of the é(¢) signal is reasonable, and this, 
coupled with its flat spectral properties, supports our assertion that 
é(t) is a flat Gaussian signal. Finally, we note that even if the PDF of 
é(t) were not Gaussian, we would still be justified in treating é(t) as a 
flat Gaussian signal. This is because we only use the power properties 
of the signal in our calculations. The pPpF of é(¢) need not be considered 
because of the pM noise being predominately granular. 

From eq. (11), we will express the peak s/n of a delta modulator 
encoding a flat Gaussian input signal as 


s/nr; = Orf 3, i = 2; 3, CES og N, (17) 


where 6 is a DM parameter having a weak dependence on f,; [see eq. 
(15) ]. The value of 7 in eq. (17) denotes the DM stage number and f,; is 
the sampling frequency for the jth stage. 

The s/n in dB will be written in upper case letters. Thus, from eq. 
(8) and the preceding discussion, we have for the N stage MSDM, 


N 
S/N = S/Nrc + b> S/Nrj (18) 


J=2 
and from eqs. (16), (17), and (18) 
N 
S/N = 10 logioPrc + (N — 1)10 logioOe + ¥) 30 logo fii 
i=l 
= 10 logioPrc + (N — 1)10 logiofr + 30 logiodA, (19) 
where 
¥ N 
A= [I fa. (20) 
i=l 
The transmitted bit rate is 


N 
b= x fei (21) 


If ac and 6r were independent of the DM sampling rates fii, 1 = 
1, 2, ---, N, then the s/n of the MSDM would be maximized by 
maximizing A, subject to the constraint that f, in eq. (21) is constant. 
This would occur when 
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fs = foo = +++ = fon. (22) 


However, @rc and 6 depend on Copt.rc and Copt,r, and although these 
slope loading factors are functions of the sampling frequency f.;, the 
variation of @gc and 6 with f,; has only a minor effect on s/N. Thus, 
with each DM operating at the same sampling frequency, the S/N for 
the MSDM is close to its peak value. To each DM stage we will, therefore, 
assign the clock frequency 


_ fh 


enabling the S/N of eq. (19) to have a peak value of 
s/N = 8/Nrc + (N — 1)8/Nr. (24) 


The s/Nrc and s/Nr terms are the values of s/nrc and s/nr in dBs, 
where f,; in eqs. (10) and (11) is replaced by f,/N. Observe that s/Nr is 
independent of the Rc Gaussian input signal x(t) applied to the MSpM 
encoder, provided that DM, tracks it to produce a flat error signal 
spectrum. Thus, the s/N of MSDM is calculable from eqs. (10) and (11). 

Substituting the result of eq. (23) into eq. (19) enables the s/N to be 
expressed in terms of parameters Orc and 6p, the transmitted bit rate 
fp, and the number of stages N, namely, 


s/N = 10 logioOrc + (N — 1)10 logi0Or 


+ 30 logio fp : 
N 


= 10 logioPrc + (N — 1)10 logioOr 
+ 30N logio fp — 30N logioN. (25) 


Chakravarthy and Faruqui’ assumed the s/N to be N times the s/N 
for each DM stage. In their presentation, 9 was not explicitly derived; 
instead it was a system parameter. From their measurements, they 
concluded that the assumed s/N was too high and, accordingly, it was 
reduced by a factor H(N — 1), where H is an empirical constant, 
namely, 


3 
s/N = 10N lool dc( — H(N — 1). (26) 


The exponent 3 is a system parameter a in their formula as they 
considered delta modulators having local decoders composed of either 
single or double integration stages. From eqs. (25) and (26) we can 
specify their value of H for the MSDM using linear DM stages as 


0 ~ 
H=10 logo( $F) = 10 toto we (27) 
Or S/N 


F 
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or from eqs. (10) and (11), 


a 1 LB Contr ; 
a~ 101d 5 (5)(e) | 2 


As »./B is independent of f5:/f-, and Copt,.r/Copt,rc is nearly independent® 
of fs:/f-, we can appreciate why H was introduced as a constant. 

Returning to eq. (25) and replacing @rc and 6 by the DM parameters, 
the peak s/N of MSDM can be expressed in terms of the normalized 
transmitted bit rate f,/f-, namely, 


s/N = —4.77 — 7.52N — 20 logioCoptrc — 20(N — 1) logioCopt.r 





B fe 


where Copirc and Copter, given by eqs. (14) and (15) have sampling 
frequencies f,; equal to f,/N. The variation of the peak S/N of MSDM as 
expressed by eq. (29), is presented in Fig. 3 as a function of f,/f. for 
different values of N. The value of 8 used is 0.235, corresponding to 
frequency parameters f; and f. of 800 and 3,400 Hz, respectively. This 
value of B is often used**® when rc Gaussian signals are employed as 


+ 10 toso( 5) + 30N tog) — 30N logioN, (29) 
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Fig. 3—The s/N of MSDM as a function of f,/f- for N having values 1 to 10, and 
B = 0.235. 
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an approximation to band-limited speech signals in telephony. For 
monochrome luminance signals, 8 = 0.01 may be used,*® and although 
this smaller value of 8 increases the S/N of DM,, the shape of the curves 
in Fig. 3 are essentially the same. Thus, it is f,/f. and N that are the 
cardinal parameters governing the S/N of MSDM. 


3.1 Selecting N for maximum S/N 


From Fig. 3, we observe that for any f,/f- there is a particular 
number of DM stages to maximize the s/N. When f,/f. < 7.5, only one 
stage should be used (linear DM), and as f,/f- is increased the number 
of stages N should be increased appropriately. The maximum value of 
the normalized frequency associated with a given N for the maximum 
s/N is, for B = 0.235, 


fo = ee 1<=N<4. (30) 
fe 0.187 


For example, N = 3 gives the maximum s/N over the frequency range 
12.8 < f/f. < 18.2. The range of f,/f- associated with a value of N to 
give maximum s/N becomes progressively smaller as N is increased. 
Practical MSDM systems are unlikely to be produced with N > 4. Thus, 
given that N can be varied to maximize the s/N for any f,/f., we have 
from the curves of Fig. 3, 


S/Nmax = 5 + 11(8). (31) 
fe 

1V. COMPARING S/N OF MSDM WITH DM, PCM, AND DPCM 

4.1 Multistage delta modulation (MSDM) 

The s/N of MSDM, given by Eq. (29), increases at a rate of approxi- 
mately 7N dB/octave increase in fp/fc, for 32 < fp/fe < 128. At lower 
values of fp/fe, the variation of s/N with /,/f- is approximately 5.3N 
dB/octave. Selecting N to peak the s/N for any f,/fc, yields the S/Nmax 
of eq. (31). 

4.2 Delta modulation (DM) 
The s/1 N of linear DM is found by putting N = 1 in eq. (29), 


s/Npm = —12.3+ 10 togo( 5) — 20 logioCre + 30 togo(£) ’ (32) 


i.e., we may view DM as a special case of MSDM. In DM, the rate of 
improvement of s/N with f,/f- varies from 5 dB/octave for 2 < 
fo/f-e < 5, to 9 dB/octave for 32 < f,/f- < 128. This rate of improvement 
is significantly less than in MSDM [see eq. (31)]. 
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4.3 Pulse code modulation (PCM) 


The input signal is sampled at the Nyquist rate of 2f., and the 
transmitted bit rate f, is 2f-n, where n is the number of bits in the code 
words. For “40 loading,” the s/N in dB of linear Pc is” 


fe 


We observe that pcm is more efficient at exchanging s/N with f,/f-, or 
n, compared to both pM and MspM. However, at low values of fp/fe, 
PCM has a lower S/N than DM. 


s/Necm = —7.3+ 6n = -7.3 4+ 2(2) : (33) 


4.4 Differential pulse code modulation (DPCM) 


The s/N in dB of linear ppc is 


s/Nppcm =-—-A+6n + Gp =-A+ 3(2) + G), (34) 


where A is a constant, and G, is the prediction gain factor. The 
constant A depends on the probability density function of the error 
signal and can often be taken as ~7 dB, i.e.,’” 


s/Nppcm = S/Npcm + Gp. (35) 
The prediction gain factor is 
Gp = 10 logio(SFM), (36) 


and SFM is the ratio of the arithmetic mean to the geometric mean of 
the discrete spectrum of the input signal.’’ For a first-order optimum 
predictor 


Gp =-—10 logio(1 = 0”), (37) 


where p is the correlation between adjacent samples of the input signal. 
The autocovariance function of an Rc Gaussian input signal band- 
limited to f; has been found by O’Neal,” enabling us to formulate an 
approximate expression for p as 


p= | exp(-20f/f) 


hi {soso 


~ 4 Qah/f) 


where £ is defined by eq. (12), f, is the sampling rate, and Si is the sine 
integral function. When the input signal is sampled at the Nyquist 
rate, i.e., fs = 2f., eq. (88) reduces to 


+ Si(2zfe/fs) -3h) /1 — (2/7)B, (38) 
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e~7* + 0.07518 
1—(2/m)p ~ 


For a value of £ of 0.235, p = 0.54, resulting in G, = 1.12 dB. 

Thus, linear ppcm has an advantage G, over linear PcM, but it has 
the same type of dependence on f,/f-. Therefore, we conclude that 
MSDM, where the number of stages are selected according to eq. (30), 
has an S/N of the same form as PCM and DPCM, namely, 


p= (39) 


s/N = Pi + (2), (40) 
where P, and P2 are parameters that depend on the type of modulation, 
and for MSDM, DM, and DPcM, on the input signal. The MSDM operating 
with the optimum number of stages to peak the S/N is, therefore, more 
efficient at exchanging s/N with f>/f- than DM [see eq. (32)]. 

The variation of peak s/N as a function of f,/f- for an RC Gaussian 
input signal having B = 0.235 is shown in Fig. 4 for DM; MSDM with 
N = 4; MSDM employing the optimum number of stages; and PCM with 
“4o loading” and sampling at the Nyquist rate. The curve of ppcM for 
B = 0.235 and Nyquist sampling is not displayed as it has the same 
shape as that of pcm, but with s/N increased by Gp. 
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PCM optN MSDM,N=4 
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Fig. 4—The s/i N as a function of f,/f.- for linear DM, B= 0.235; MSDM, N = 4, B = 0.235; 
MSDM with optimum N, £ = 0.235; and pcm, fp = 2f., “4o loading.” 
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V. THE MSDM USING ADM STAGES 


An MSDM codec is required to operate over a wide range of input 
levels. In Section III, we considered the delta modulators to be linear 
with their step sizes adjusted to maximize their S/N values. When the 
input power of such an MSDM encoder is reduced causing DM, to 
operate in the granular region with a slope loading factor Crc < 8, the 
noise generated in DM; is substantially independent of the input signal 
power.’ As the filtered error signal of DM, is the input to DM2, and the 
power of this signal is approximately unaltered by the reduction of the 
input power to DM, the S/N of DMz2 remains the same. Because the 
power level to DM2 is unchanged, the S/n’s of subsequent DM stages 
are therefore similar, particularly for N = 4. As the input power to 
DM, is further reduced, the noise generated in DM, will eventually 
increase,’ causing some overloading in DM2 which, in turn, will reduce 
the S/Nn’s in subsequent DM stages. Thus, if excessive granular noise is 
generated in DM;, slope-overloading in the remaining delta modulators 
ensues. However, this slope-overloading of the second and subsequent 
DM Stages for a reduction in input level to DM operating in the granular 
region, is far less severe than when the input power is increased, 
causing DM, to become slope-overloaded. When DM; is slope-over- 
loaded, all the stages in the MSDM experience slope-overload, and the 
s/N of the MSDM rapidly deteriorates with increasing input power. 

The range of input power over which the MspM can operate, while 
providing an acceptable S/N, i.e., the dynamic range, can be greatly 
enhanced by using adaptive delta modulators (ADMs) instead of linear 
delta modulators. The diversity of ADMs is considerable,’ but all have 
the property of extending the dynamic range, while retaining an 
approximately constant s/N. The quantization noise power in ADM is, 
therefore, proportional to the signal power. As the input signal power 
varies, the filtered error signal applied to DM 2 varies, and if this encoder 
is not to be overloaded, it must also be adaptive. The same argument 
applies to subsequent stages, and hence the complete MSDM codec is 
composed of ADM stages. 

If the N ADM encoders in the MSDM codec are syllabically companded 
delta modulators’ [currently available on a chip’ in the form of contin- 
uously variable slope delta (CvsD) codecs], and they use single integra- 
tors in the local decoding process, then the peak s/N of each ADM stage 
is a close approximation to that given by eqs. (10) or (11). Further, the 
dynamic range of each codec where the S/N is maintained near its peak 
s/N is wide,” typically 40 dB. Thus, the maximum s/N of MSDM given 
by eq. (29) is a good approximation for MSDM having ADM stages over 
a wide range of input power. 
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VI. DISCUSSION 


Easily evaluated equations of peak S/N for MSDM have been derived 
in terms of normalized bit rate, number of DM stages, and the shape of 
the spectrum of the Rc Gaussian input signal. A suitable choice for the 
sampling rate for each delta modulator was found to be f,/N. For a 
given input signal bandwidth f., and transmission bit rate f,, there is 
an optimum number of stages which maximizes S/N. The variation of 
this s/N with f,/f- is given by eq. (29), and we showed in Section IV 
that this has the same form as for PCM and Dpcm. Thus, MSDM is more 
efficient than pM at exchanging s/N for fp/fc, but the s/N of MSDM is 
generally lower than that of Pcm and prc. At very low values of f,/f:, 
DM performs better than the other modulation methods considered 
(see Fig. 4). In Section V, we discussed MSDM with cvsD stages, 
concluding that the s/N of eq. (29) remains valid, provided that single- 
stage integrators are used in the cvsp codecs. Further gains in S/N are 
attainable if double stage integrators’ replace the single-stage inte- 
grators. The cvsD codecs enable the MSDM to have a wide dynamic 
range and, therefore, we envisage MSDM codecs to be constructed” 
with adaptive, rather than linear, delta modulators. 

Although this work has been concerned with the derivation of s/N 
of MSDM, we conclude with the observation that MSDM having four 
CvsD codecs might have a role to play in a variable bit-rate transmis- 
sion system. For example, each CvsD stage could operate at 16 kb/s 
giving a transmitted bit rate of 64 kb/s when the four stages are in use. 
In a time division multiplex (TDM) system, the MSDM codecs would 
attempt to operate at 64 kb/s, but as traffic increased they could 
discard the higher order CvsD stages, decreasing their bit rates from 64 
to 48 to 32 kb/s, until when the system is at maximum capacity, each 
MSDM would behave as a 16-kb/s CvsD codec. By using MSDM instead 
of a single-stage DM operating at the same bit rate, we are able to 
enhance the quality of the recovered signal as the bit rate increases 
from 16 kb/s. 
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Note on Some Factors Affecting Performance 
of Dynamic Time Warping Algorithms for 
Isolated Word Recognition 


By L. R. RABINER 
(Manuscript received October 14, 1981) 


To implement a dynamic time warping (DTW) algorithm for isolated 
word recognition, a number of factors must be specified. These factors 
include endpoint constraints, local path constraints, global path 
constraints, axis orientation, and local distance measure. Although 
a number of studies have been made to decide how best to choose 
these factors, several unresolved issues remain. The problems with 
choosing these factors become more complicated if the word reference 
patterns are not created from whole words, but instead are built up 
from subword units, e.g., syllables or demisyllables. In this paper, we 
consider an isolated word recognizer in which the reference patterns 
are obtained from a set of demisyllables as specified in a user- 
supplied lexicon. Different pronunciations of a word are represented 
by multiple entries in the lexicon. Because of the inherent boundaries 
in the reference patterns (at each demisyllable junction), it was felt 
that local constraints could influence performance more than for the 
standard whole-word case. Further, it was felt that some flexibility 
in the endpoint path constraint at the end of the word would be 
helpful, in general, for isolated word recognition caused by high 
variability at the ending of words with stops. A DTW algorithm was 
programmed and tested on an 1109-word vocabulary. Results showed 
small accuracy improvements when the path endpoint was allowed 
to vary across four frames (of either test or reference pattern). Loos- 
ening the local path constraints, however, had a significant degrad- 
ing effect on the performance. 


I. INTRODUCTION . | 
The technique of using dynamic time warping (DTW) to align (in 


time) a test and reference pattern for isolated word recognition has 
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been shown to be effective in a wide variety of recognition systems.’° 
Although a great deal of investigation has been made into “optimal” 
DTW algorithms, there still remains uncertainty as to how best to 
specify the factors of the pTW implementation to achieve the highest 
recognition accuracy. 

In this paper, we consider two of these factors, namely, the endpoint 
constraint at the end of the warp and the local path constraints. 
Previous investigations were made by Rabiner et al.* and Myers et al.° 
on the effects on word recognition accuracy of both loosening endpoint 
constraints and using different local path constraints. However, the 
work‘ on DTW algorithms with relaxed endpoint constraints considered 
only two specific variations, namely, the unconstrained endpoint case 
(the UE2-1 algorithm), and the local minimum case (the UELM 
algorithm). Neither of these algorithms considered relaxing just the 
endpoint constraint at the end of the word. This constraint is very 
important since replications of isolated words generally have the most 
variability at the end of the word. Similarly, although various local 
path constraints were considered in Ref. 6, the use of subword units in 
the reference patterns raises again the question as to whether increased 
flexibility in the choice of warping path leads to improvements in 
recognition scores. We show that by loosening the endpoint constraint 
at the end of the utterance, a small but consistent increase in recog- 
nition accuracy is obtained. It is also shown that the Itakura local path 
constraints yield higher recognition accuracies than generalized Itak- 
ura local constraints (Type III, Ref. 6). This result corroborates pre- 
vious findings*® which show that, in general, opening up of the region 
for DTW matching of words generally leads to degraded performance, 
since the improved score for the correct matches is offset by the 
improved scores for incorrect matches. 

The outline of this paper is as follows. In Section II, we briefly 
review the DTW implementation, and describe the factors which were 
studied. In Section III, we present and discuss results of a recognition 
test using a 1109-word vocabulary, and in Section IV, we summarize 
the findings. 


il. THE DTW IMPLEMENTATION 


Assume that we are given a test pattern T, consisting of a sequence 
of N vectors, 1.e., 


T = {T(1), TQ), ---, T(N)}, (1) 


where the vector T(z) is a spectral representation of the ith frame of 
the test word. The vectors, T(i), are a set of 9 autocorrelations (from 
which an 8th order linear predictive coding model is derived). The 
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duration of the test word is N frames, where each frame represents 45 
ms of speech, and adjacent frames are spaced 15 ms apart. 

For a given vocabulary of V words, we denote the reference pattern 
for the vth word as R,, and we represent each reference pattern as a 
sequence of M, vectors, i.e., 


R, = {R.(1), R, (2), meta, | R.(M.)} ’ (2) 


where each vector is again a spectral representation of the correspond- 
ing frame within the word. 

To optimally align the time scales of the test (the n index) and 
reference (the m index) patterns via a DTW algorithm, we must solve 
for a warping, or path alignment function of the form 


m= w(n) (3) 


and thereby seek to minimize the average distance 
be oe 
D=~— X a{T(n), R[w(n)]} (4) 
n=1 


over all possible w(n), where d is the local distance between test frame 
n and reference frame m = w(n). To solve the pTw problem requires 
specification of the following:® 
(t) endpoint constraints 
(it) local path constraints 
(iii) global path constraints 
(tv) axis orientation 
(v) local distance measure. 
In this paper, we have considered the effects, on recognition accuracy, 
of two of the ptw factors—the local endpoint constraints, and the local 
path constraints. In particular, we have considered the following DTW 
specifications: 


Endpoint Constraints—We assume the word endpoints of the test and 
reference patterns satisfy the path constraints 


w(1) = 1 (5a) 
M —- drs w(n) =M, N-drsnZNn, (5b) 


where dz is a range of frames, at the end of the reference pattern, and 
57 is a range for frames, at the end of the test pattern, where the warp 
path can terminate. Thus, the warp path begins at the point (1, 1) and 
ends in a region of dr frames from the end of the reference, and 67 
frames from the end of the test. 


Local Path Constraints—We assume that the local path satisfies one 
of the two sets of constraints shown in Fig. 1, namely, the Itakura? 
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path constraints (Fig. la) and the Myers et al.® Type III path con- 
straints (Fig. 1b). Both of these local path constraints satisfy the 
continuity equations 


0<w(n) —w(n-1) 52 (6a) 
w(n) — w(n — 1) =0> w(n — 1) — w(n — 2) > 0. (6b) 


The difference between the Itakura and the Type III path constraints 
is a subtle, but an important one, and it is related to the look-ahead 
(or look-back) capability of the local path constraints in finding the 
best path to a given grid point. This difference is illustrated in Fig. 2 
which shows three possible local paths to grid point (n, m). Path 1 
comes initially from grid point (n — 2, m — 1) and goes through grid 
point (n — 1, m) before terminating at grid point (n, m). Path 2 goes 
from grid point (n — 1, m — 1) to grid point (n, m). Path 3 goes from 
grid point (n — 1, m — 2) to grid point (n, m). For the Itakura 
constraints (with 1 level of look-ahead logic), if the best path to the 
intermediate grid point (n — 1, m) came from grid point (n — 2, m)— 
that is, it came across—then the possibility of path 1 being the best 
path to grid point (n, m) is not considered, whereas for Type III 
constraints (with two levels of look-ahead logic) path 1 is considered 
along with paths 2 and 3. At the end of a path (and often within the 
path), these differences can be significant. 


Global Path Constraints—The endpoint and local path equations 
constrain the range of frames of the reference pattern (the m-axis) for 
which the basic DTW recursion is done to: 


Mi(n) Sms Mu(n), (7) 
where 


My(n) = min{2(n — 1) + 1, M — (N — 67 — n)/2, M} (8a) 


(n-1,m) (n,m) (n-1,m) (n,m) 






(n-1,m-1) 





(n-2,m-1) 
(n-1,m-1) 





(n-1,m-2) 


(n-2, m-2) (n-1,m-2) 


(a) (b) 


Fig. 1—Two sets of ptw local path constraints. (a) Itakura constraints. (b) Type III 
constraints. 
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(n-2,m) (n-1,m) (n,m) 






(n-2, m-1) (n-1,m-1) 


(n-1,m-2) 


Fig. 2—Possible local paths for Itakura and for Type III local constraints. 


PATH ENDING REGION ~, 


\ 


n-87 | 





REFERENCE FRAME, (m) 





TEST FRAME, (n) 


Fig. 3—The region in the warping plane in which the dynamic time warping path can 
lie with ending uncertainty of 57 frames along the test and dz frames along the reference. 


My,(n) = max{(n — 1)/2 + 1, M — 2(N — n) — dz, 1}. (8b) 


These global path constraints essentially define a parallelogram (as 
shown in Fig. 3) with lines of slope 2 and 4% emanating from the points 
m=1,n=1, and from m = M — 62,n=N— 56r. 


Axis Orientation—We assume that the test sequence index 7 is always 
mapped to the abscissa and the reference sequence index m is always 
mapped to the ordinate. Experience has shown that this orientation 
leads to the best performance in isolated word recognition systems.*® 
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Local Distance Measure—The local distance measure used in this 
study is the Itakura log likelihood ratio” which is implemented in the 
form 


d[T(n), R(m)] = log[T(n)+R(m)], (9) 
i.e., a log of the dot product of the two vectors T(n) and R(m). 


2.1 Word recognition algorithm 


Word recognition is achieved by computing the optimal warping 
path and distance for each word in the vocabulary, giving 


Sas fens Ny 3 

Pye | aein | 2 €(re, Ratwon) | oy 
w(n) N-6Trs=NSN N n=1 

and using the nearest neighbor rule to choose u* as the best candidate 

where 


v* = argmin[D,]. (11) 


An ordered set of word distances is also maintained for statistical 
analysis purposes. 


2.2 Creation of reference patterns 


The reference patterns, R,, for the vth vocabulary word was created 
from a lexicon which contained one or more demisyllable-based spec- 
ifications for each word.’ The entries in the lexicon were obtained 
by converting a standard dictionary word pronunciation to a set of 
demisyllable tokens which could be concatenated to create the word. 
Variant word pronunciations were represented by multiple lexical 
entries. 

A set of 955-demisyllable tokens were used as the set of basis units 
from which each word was created. An 1109-word vocabulary (Basic 
English®) was used as the recognition vocabulary and a total of 1773 
lexical entries were used for the vocabulary. 

In creating the individual reference patterns from the demisyllable 
prototypes, a distinct boundary is created between each pair of demi- 
syllables. To minimize boundary effects, a nonlinear smoother was 
used to interpolate (in a minimum mean-squared error sense) the vocal 
tract log area ratios (i.e., the ratio between the areas of adjacent 
sections of a nine-section vocal tract model for each frame of speech) 
in a four-frame vicinity of the boundary. Because of the presence of 
one or more boundaries within each reference pattern, it was felt that 
the local path constraints could potentially have more effect on the 
recognition results than for the case of reference patterns created from 
isolated words. 
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2.3 Summary of DTW factors that were studied 


As discussed in this section, there are two DTW factors that were 
studied in the context of isolated word recognition from reference 
patterns created out of a corpus of demisyllables. These factors were 
as follows: 

(t) The range at the end of the test pattern, 5r—Two values of 
Sr were chosen, namely, 67 = 0 (the standard case) and 57 = 4 frames 
(60 ms), corresponding to the duration of a release of a burst, etc. 

(tt) The range at the end of the reference pattern, 6g—For reasons 
similar to those in the choice of values for 57, values of 0 and 4 were 
studied for dz. 

(iit) The local path constraint—Both the Itakura and the Type III 
constraints were studied in conjunction with variations of 8p and 67. 
Only one simple experimental evaluation was made, and the results 
are given in Section III. 


ill, EXPERIMENTAL EVALUATION 


To study the effects of the two factors discussed at the end of 
Section III on the recognition accuracy, an experiment was performed 
in the following way. The system was run as a speaker-trained, isolated 
word recognizer with a testing vocabulary of 1109 words (i.e., one 
replication of each word of the vocabulary) and a reference set of 1773 
patterns obtained from the lexical description of the 1109 word-vocab- 
ulary. The pTw algorithm (with the flexibility of varying the DTW 
factors) was programmed in FORTRAN on a Data General Eclipse 
S230 computer. (Normally, a CSP MAP 200 array processor is used to 
process the data at high speed; however, the flexible version of the 
DTW algorithm was not available on the MAP.) Each ptw alignment 
took on the order of 1 second on the Eclipse; hence, a complete test on 
a single set of DTw factors would have taken on the order of 5000 hours 
of computing. Clearly, this amount of computation is prohibitive. 
Thus, an alternative testing procedure was derived in which a “stand- 
ard” DTW was run on the CSP MAP (where 6x = dr = O and the 
Itakura local constraints were applied). The complete matrix of dis- 
tances (1109 words X 1773 references) was scanned and for each test 
word, the most highly confused alternative word was found (i.e., the 
word reference different from the test token with the smallest dis- 
tance). A second “word recognition” test was now performed on the 
Kclipse in which, for each test word, only two DTw distances (with the 
variable factors included) were measured. The DTw distances were, 
thus, obtained for a reference corresponding to the spoken word, and 
for a reference corresponding to the vocabulary word most similar to 
the spoken word. In this manner, the eight combinations of dz, 67, and 


DYNAMIC TIME WARPING ALGORITHM 369 


local constraints were varied and their effects on word recognition 
accuracy were measured. 

Before proceeding on to the results, some justification of this non- 
standard testing methodology must be given. It should be clear that 
for test words which were correctly recognized, and which had one or 
fewer close alternative words, such a procedure is acceptable for 
studying the effects of pTw parameters. It can also be seen that for 
cases in which the ptw factors degrade performance using the two 
candidate algorithms, performance with all candidates will be degraded 
even further. Hence, the only cases in which we cannot place total 
reliance on the results are those words with several close candidates 
for recognition. Hopefully, the number of such cases is small and will 
not greatly affect the results presented. 


3.1 Experimental results 


For each set of ptw factors (i.e., specification of dr, dr, and LC— 
local constraints) a set of three quantities were measured, namely: 

(t) Overall recognition accuracy—This quantity assumes that use 
of the two references best matching the test word was adequate when 
the ptw factors were varied. To the extent that this is the case, this 
measurement is the performance index of most concern. 

(ti) Average distance separation—This quantity measures the av- 
erage difference of the distances from the test to the reference pattern 
of the spoken word, and to the reference pattern of the closest com- 
petitor. It should be noted that the average distance separation was 
computed over all vocabulary words (including errors); hence, for 
errors the distance separation is negative and for correct recognitions 
it is positive. The average distance separation is one measure of 
separation of the distances for the correct words, and distances for 
nearest competitors. 

(tit) Number of occurrences of maximum distance separation over 
the eight factors—This quantity counted the number of times the 
particular set of ptw factors provided the best distance separation 
(over the eight sets of factors) for a given word, if it was recognized 
correctly. Hence, this measurement is a marginal count (conditioned 
on correct recognition) of the superiority (in terms of distance sepa- 
ration) of one set of factors over the seven alternative sets. 

The results of the word recognition experiment, in terms of the 
above three measurements, are given in Table I. The code LC = 0 is 
used to denote the Itakura local constraints; similarly, LC = 1 is used 
to denote the Type III local constraints. The following results can be 
seen from Table I: 

(t) The overall recognition accuracy using LC = 0 is from 2 to 3 
percent higher than when using LC = 1 for all choices of 6p and 67. 
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Table |—Performance results for DTW factors 


Number of 
Percent of Occurrences 
DTW Factors Overall Average of Maximum 


Recognition Distance Distance 


LC Sr or Accuracy Separation Separation 
0 0 0 76.4 0.044 356 
0 4 0 76.4 0.043 355 
0 0 0 76.9 0.045 483 
0 4 4 77.2 0.044 445 
1 0 0 74.4 0.039 291 
1 4 0 73.7 0.039 294 
1 0 4 74.8 0.040 377 
1 4 4 74.2 0.039 343 


(it) The variation in recognition accuracy for a fixed value of LC 
(and varying dz and 67) is small. 

(uit) The “best” recognition accuracy scores are for LC = 0, dr = 0 
or 4, and 67 = 4. 

(tv) The average distance separations for LC = 0 are about 10 
percent higher than for LC = 1. 

(v) The best performances in terms of occurrences of maximum 
distance separation are for LC = 0, 57 = 4, and dr = 4 and 0. Similarly 
for LC = 1, the cases 67 = 4 and 5p = 0 and 4 give the best performance 
in terms of this measurement. 


3.2 Discussion 


The results presented in Section 3.1 lead to the following two 
conclusions: 

(i) The Itakura local constraints (LC = 0) yield better performance 
than the more general Type III local constraints for all choices of 5z 
and 67, and for all three performance measurements. 

(vi) The selection of 5; = 4 (with dz = 0 or 4) is marginally better, 
in terms of recognition accuracy and average distance separation, than 
the selection of 6; = 0. However, in terms of maximum distance 
separation, the selection of dr = 4 is significantly better than the 
selection of 57 = 0. 

The first conclusion is expected in the sense that a great deal of 
earlier research has shown that any broadening of path regions invari- 
ably degrades performance for ptw algorithms,*® since the improve- 
ment in score for the correct reference is generally more than offset by 
the improvement in score for the nearest incorrect reference. Although 
it was anticipated that the special construction of the reference pat- 
terns (i.e., from concatenated demisyllable units) might mitigate this 
general result, the data of Table I shows that this is not the case. It is 
especially notable that even for the case of two carefully chosen 
reference patterns in the recognizer, this general conclusion is still 
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valid. This result tends to lend some credence to the somewhat unusual 
testing methodology. 

The second conclusion, although anticipated, is somewhat more 
difficult to explain. One expectation was that opening up the ending 
region (via nonzero values of 5g and for 67) would lead to the same 
effect noted in conclusion one, namely, degraded performance. This 
was definitely not the case here. A second expectation was that if 
opening up the ending region was a good thing to do, both dz and dr 
would have to be nonzero. Although nonzero values of both dz and dr 
led to the best performance, it is seen that dg = 0 was an almost 
equally good choice. The reason for this is the assymetry in the DTw 
implementation in which every single test frame is used in the warping 
path, but any given reference can be omitted entirely (since the path 
can jump by 2 frames). As such, at the end of the path, 2 of the 4 
ending reference frames could be skipped with dr = 0. Thus, the 
importance of a nonzero value of dr is greatly lessened, whereas a 
nonzero value of 57 clearly leads to improved performance. 

Additional analyses of the recognition results were performed to see 
if any simple correlations existed between cases in which improved 
recognitions occurred and specific linguistic events of the ends of the 
words, e.g., stops, nasals, fricatives, etc. No consistent and meaningful 
correlations were found. Hence, we conclude that the results of Table 
I are probably independent of the vocabulary items and the fact that 
the vocabulary was created from demisyllable tokens. 

A final comment concerns the recognition accuracy achieved for the 
1109-word vocabulary. Using whole word speaker-trained templates, 
Rabiner et al.’ achieved a recognition accuracy of 79.2 percent across 
6 talkers for the same 1109-word vocabulary. Thus, the best accuracy 
achieved using demisyllable-based templates (77.2 percent) compares 
favorably with the accuracy from whole word templates. 


IV. SUMMARY 


We have examined briefly the effects on isolated word recognition 
of three factors in the DTw alignment procedure. It was found that the 
Itakura local path constraints provided the best performance and that 
relaxing the ending constraint on the dynamic path also provided 
small, but consistent, improvements in performance. 
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LETTER TO THE EDITOR 


Comments on ‘‘Fractionally Spaced Equalization: An Improved Digital 
Transversal Equalizer,’’ by R. D. Gitlin and S. R. Weinstein* 


(Communication received November 1981) 


In the Gitlin and Weinstein paper, Fig. 3 and Formula (15) give the 
following result: the equalized signal, called w(t), is analytic, i.e., its 
real and imaginary parts are a Hilbert transform pair. This result is 
contradictory with the fact that in QAM, Re{u(t)} contains in-phase 
data a,, and Im{z(t)} contains independent quadrature data b,. It is 
worthwhile to make the three following points: 

(t) g(t) in Formula (13), is not analytic since w, is within the band 
of s(t). Therefore, g(t) cannot be written as g(t) + jq(t) as it was done 
on page 281. Consequently, z(t) is complex but not analytic and its 
real and imaginary parts are not Hilbert transform. g(t) and w(t) are 
analytic if w,. is on the edges of the band of s(¢) or outside this band as 
is the case in single sideband systems. 

(tt) In Formula (9), S(t) is analytic if and only if P(w) = 0; 
|w| = We. 

(iii) On page 280, it should be noted that f(t) should be given by 
the convolution of 4Xp(t)e~? with p(t), where 0 = X.X(w.). The factor 
% was indeed mentioned in the Appendix, Formula (62). 


G. Kawas-Kaleh 

Ecole Nationale Supérieure de Télécommunication 
Department Systemes et Communications 

46 Rue Barrault, 75634 Paris, CEDEX 13 


Author Response 


Mr. Kawas-Kaleh is correct in pointing out that g(t) is not an 
analytic signal. Since r(t) = r(t) + jr(t) is an analytic signal centered 
about w-, the demodulated signal g(t) will be a complex signal having 
frequency components centered about the origin, and hence cannot be 
an analytical signal. 

The condition that P(w) = 0 when |w| = we, which is implied though 
not explicitly stated in the paper, is standard practice. 

It should be noted that Mr. Kawas-Kaleh’s comments concerning 
analytic signals in no way affect the performance attributes of the 
fractionally spaced equalizer reported in the paper. 


R. D. Gitlin 


*B.S.T.J., 60, No. 2 (February 1981), pp. 275-96. 
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B.S.T.J. BRIEF 


Experimental Verification of 
Ultra-Wide Bandwidth Spectra in 
Double-Clad Single-Mode Fiber 


By S. J. JANG,* L. G. COHEN, W. L. MAMMEL, and M. A. SAIFI* 


(Manuscript received December 23, 1981) 


The low-loss and low-dispersion properties of single-mode fibers 
make them obvious choices for wide bandwidth system applications 
with very long repeater spans. This brief describes the fabrication 
procedure and transmission properties of a double-clad single-mode 
fiber which is capable of wide bandwidth (greater than 100 GHz-km 
for laser sources with 4-nm emission bandwidths) transmission over 
the widest wavelength range (1.45 ym to 1.73 ym) thus far reported in 
the literature. This range completely covers the lowest-loss wavelength 
window for fused-silica optical fibers. Double-clad lightguides’* were 
formed by using an inner cladding to form an index well between the 
core and a pure silica outer cladding. A computer-aided analytical 
procedure was used to choose the proper fiber diameter so that 
waveguide dispersion effects could be used to cancel material disper- 
sion at predetermined wavelengths.° 

The modified-chemical-vapor-deposition (MCVD) technique was used 
for preform fabrication. Sixty outer cladding layers, a composition of 
GeO2-P20;-SiO2 and fluorine, having the same refractive-index as pure 
silica were deposited by the MCvD method inside a 16- by 20-mm fused 
silica tube. The composition of six inner cladding layers is fluorine- 
doped silica in order to maintain a 0.35 percent negative index differ- 
ence relative to the outer cladding. Then six core layers of GeO2-SiO»2 


* Western Electric, Engineering Research Center, Princeton, N.J. 
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composition having a 0.55 percent positive index difference relative to 
the outer cladding were deposited. The refractive-index profile and 
diameter of the preform were measured by the laser beam refraction 
technique.° Figure 1 shows the refractive-index profile of a double-clad 
lightguide preform. The usual dopant burn-off from the center of the 
core during the high temperature collapsing step caused profile devia- 
tions from an intended step-index shape. The fiber was drawn from 
the preform at 2100°C using an RF-induction zirconia furnace and 
coated by a 50-um-thick UV-curable acrylate resin (EA-II) immedi- 
ately after exiting from the furnace. 

The preform index profile data in Fig. 1 were used in a computer 
analysis program’ to predict dispersion and bandwidth spectra for 
single-mode fibers. The optimal fiber diameter, 2a = 11 ym, was 
calculated to achieve the highest possible transmission bandwidths 
over the widest possible band of wavelengths within the lowest loss 
window for fused silica optical fibers. Calculated results in Fig. 2 show 
how dispersion effects caused by the structure of the double-clad 
waveguide (curve . . .) can be apportioned to cancel dispersion effects 
caused by germania and fluorine-doped fused silica materials 
(curve - - - -). Results obtained by summing points on those two curves 
predict the total chromatic dispersion (curve ——) if the preform 
profile in Fig. 1 is drawn into a fiber with a diameter of 2a = 11 ym. 

Figure 3 plots measurements of group delay versus wavelength 
obtained from 1-km-long fibers that were drawn from the preform 
characterized in Fig. 1. Fiber No. 1 had a diameter of 2a = 11 um and 
Fiber No. 2 had a diameter of 2a = 13.2 um. The curves were fitted to 
the data using a least-mean-square-fit method.’ Figure 4 shows chro- 
matic dispersion spectra obtained by differentiating the curve shown 
in Fig. 3. The predicted (curve... .) and measured (curve ——) values 
of chromatic dispersion for the optimal fiber core diameter (2a = 11 
pum) agree closely. Notice that the fiber dispersion is less than 0.665 
ps/ km X nm between the two spectral zero crossings at A = 1.495 um 
and 1.666 um. This low-dispersion spectral range, for Fiber No. 1, is 
nearly 2.5 times wider than for the curve, labeled Miya et al.,* corre- 
sponding to the best previously published result. Figure 5 shows 
transmission bandwidth spectra transformed’ from chromatic disper- 
sion spectra in Fig. 4. Results are applicable if Fibers No. 1 and No. 2 
are excited by a laser source with a 4-nm linewidth and can be rescaled 
for different linewidths. Notice that bandwidth values for Fiber No. 1 
are larger than 100 GHz-km for all wavelengths spanning the region 
from 1.45 pm to 1.73 um. Low loss values were maintained simultane- 
ously with low dispersion. Specific loss values were 0.8 dB/km at A = 
1.3 pm, 0.4 dB/km at A = 1.6 ym, and 4 dB/km near the center of the 
water absorption peak at A = 1.39 ym. 
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The importance of choosing optimal lightguide parameters can be 
appreciated by making comparisons between the dispersion and band- 
width spectra of Fiber No. 1 (solid line) and Fiber No. 2 (dashed line). 
Fiber No. 2, which has a cladding diameter (2a) about 2.2 um 
(or 20 percent) larger than the optimal value, shows only one zero 
dispersion wavelength. As a result, 100 GHz-km bandwidth values can 
be maintained only within a 0.026-1m wavelength range that is an 
order of magnitude narrower than the corresponding 0.28-um wave- 
length range for the optimal Fiber No. 1. 

In conclusion, this brief reports the fabrication procedure and trans- 
mission properties of a double-clad fiber which has potential for 
wavelength-division-multiplexing applications within wide wavelength 
ranges. A computer analysis program was used to determine the 
optimal lightguide diameter which demonstrated high bandwidths 
over the widest range of wavelengths (1.45 um to 1.73 ym) published 
to date. Further efforts are concentrating on fabricating fibers whose 
high bandwidth spectra cover the band from 1.3 ym to 1.55 pm. 
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Fig. 1—Refractive index profile of a double-clad single-mode fiber preform measured 
by the laser beam refraction technique. 
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Fig. 2—Dispersion spectral components predicted if the preform profile in Fig. 1 is 
drawn into a fiber with 2a = 11 ym diameter. 
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Fig. 3—Group delay spectra from 1-km length of Fibers No. 1 (2a = 11 pm) and No. 
2 (2a = 18.2 pm). 
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Fig. 4—Chromatic dispersion spectra calculated from Fig. 3 and predicted chromatic 
dispersion spectrum for diameter 2a = 5.5 um (dotted line). 
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Fig. 5—Bandwidth spectra calculated from Fig. 4. 
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